Re: realise diff-updates with dpkg

2013-02-22 Thread iceWave IT

 I don't know how big your squidguard blacklists are (it's a good idea to
 include details when asking questions), but the largest one I could find
 was 20MB [...]

My biggest list is 22 MB and gets daily updates. But because it uses a lot of
RAM when loaded, I check all entries once a week so that only servers that are
still online remain in the list; that brings the total size down to about 16 MB.
All in all there are 16 lists between 1 and 16 MB. No problem for your internet
connection, maybe - but the connection of my clients sometimes only gets 42 kb/s!


Anyway, rsync sounds like the most appropriate mechanism to transfer these
 particular databases.

My blacklists should be available to everyone, not only to those who can
connect to my server via ssh...


Re: realise diff-updates with dpkg

2013-02-22 Thread Peter Samuelson

  Anyway, rsync sounds like the most appropriate mechanism to
  transfer these particular databases.

[iceWave IT]
 My blacklists should be available to everyone, not only to those who
 can connect to my server via ssh...

rsync doesn't require ssh; for your scenario you probably just want
'rsync --daemon' from inetd.  And yes, this is very much a job for
rsync.  Or go with zsync, if you want to use http as a transport.
Don't try to emulate rsync with dpkg, it will not go well.  dpkg
doesn't add anything you can't get with a little scripting (and
producing your incremental debs definitely requires some scripting
too).
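
For example, a minimal daemon setup might look like this (module name, paths
and hostname are invented here; see rsyncd.conf(5) for the details):

  # /etc/rsyncd.conf on the server
  [blacklists]
      path = /srv/squidguard/blacklists
      read only = yes

  # /etc/inetd.conf line, so inetd spawns the daemon on demand
  rsync   stream  tcp  nowait  root  /usr/bin/rsync rsyncd --daemon

  # on each client: anonymous pull over the rsync protocol, no ssh needed
  rsync -az rsync://blacklists.example.org/blacklists/ /var/lib/squidguard/db/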

(If reloading a full dataset into your database is _that_ expensive,
what you want to do is 'ln'-backup the old file, rsync the new, 'diff'
them, and use the diff output to generate a database import script.)
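
Roughly, and untested (file and host names invented):

  ln -f current.list previous.list    # hard-link "backup" of the old file
  rsync -z rsync://blacklists.example.org/blacklists/current.list .
                                      # rsync replaces current.list with a new
                                      # file, so previous.list keeps the old data
  diff previous.list current.list | awk '
      /^< / { print substr($0, 3) > "remove.txt" }   # entries that disappeared
      /^> / { print substr($0, 3) > "add.txt"    }   # entries that were added
  '
  # feed remove.txt / add.txt to the database import tool instead of
  # reloading the full list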





Re: realise diff-updates with dpkg

2013-02-21 Thread iceWave IT
 A DNSBL is the traditional solution for blacklists, why are you
 putting your blacklist in a .deb?

I meant blacklists specifically for squidGuard. Those are huge files with
domains / URLs inside, so that e.g. porn can be blocked in your network.


 What if a node misses an update, then?
 And what if it's not turned on for a month, or gets restored from an old
backup?

dpkg does nothing but execute the postinst script - DEBIAN/control carries the
information about which version the update takes you from and to (e.g. from
version 1 to version 2) - if the installed version is 0.9, dpkg can't execute
the script because the right version is not installed. To know which update has
to be installed, a front end like aptitude has to know which updates are
available and fetch them. If the client was offline for a long time, the front
end might not find a complete chain of updates from the installed version to
the current version, so the complete package has to be downloaded again.
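
To make that concrete (file names invented, and none of this exists in apt or
dpkg today): a client still on version 1 that wants to reach version 4 needs
the whole chain of update packages, and a single missing link forces a full
download:

  blacklist-update_1-to-2.deb   # Upgrade-Version: 1, Version: 2
  blacklist-update_2-to-3.deb   # Upgrade-Version: 2, Version: 3
  blacklist-update_3-to-4.deb   # Upgrade-Version: 3, Version: 4
  # if e.g. the 2-to-3 package is no longer available, the front end has to
  # fall back to downloading the full blacklist_4.deb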


Re: realise diff-updates with dpkg

2013-02-21 Thread Paul Wise
I don't know how big your squidguard blacklists are (it's a good idea
to include details when asking questions), but the largest one I could
find was 20MB, much smaller than some of the packages I maintain in
Debian, let alone the largest package in Debian. Anyway, rsync sounds
like the most appropriate mechanism to transfer these particular
databases.

-- 
bye,
pabs

http://wiki.debian.org/PaulWise





realise diff-updates with dpkg

2013-02-20 Thread iceWave IT
Hi ;)

I would like to present an approach that could make updates easier.

With daily changing databases such as virus databases, blacklists, map data or
similar, part of the old data is always erased while new data is added.
Programs such as clamav ship their own update client for this.
But I would also like to be able to use dpkg: every day a new package with the
new database is simply created, which the clients can then download with
aptitude. For small data sets this works, but for larger ones it leads to a
very high data volume, which is no advantage on slow internet connections.
Moreover, it is also pointless to download 1 million records only to remove 10
from the database and add 25 in the end.

There is, however, the possibility of using debdelta; then only the changes to
the package are downloaded and the package is patched locally. This is
generally a very good idea, but here the whole package is still replaced rather
than only the change being applied. For small data sets this is again OK, but
deleting more than 1 million records and reloading them into the database
brings huge performance problems.

To solve this problem I have devised the following idea:
- At regular intervals a new version of the package X is created (like the
“old” way).
- A package Y does not contain the data, but only the control archive with the
maintainer scripts. (Data in /tmp is also possible, but only temporarily!)
Through the fields Update-Package and Upgrade-Version in DEBIAN/control it is
specified for which package and for which version the update is intended:
Version is the new version, Upgrade-Version the old one (see the control file
sketch below).
When installing the update, dpkg runs the maintainer scripts, bumps the version
of the package to the new version and then deletes the data that was needed for
the update.
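
For illustration only, the control file of such an update-only package could
look roughly like this (package name and version numbers are invented;
Update-Package and Upgrade-Version are the fields proposed here and do not
exist in today's dpkg):

  Package: squidguard-blacklist
  Version: 2.0
  Architecture: all
  Update-Package: squidguard-blacklist
  Upgrade-Version: 1.0
  Description: incremental squidGuard blacklist update from 1.0 to 2.0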

Now it is possible to create a package Y with a script in postinst that deletes
our 10 records and adds the new 25. This should bring a huge performance
increase. This type of update is, however, NOT intended for normal packages
such as software or libraries, but only for frequently changing data sets. The
approach would remain compatible with the existing dpkg, because an old
aptitude would simply always update the package completely, and only a new
version would download the diffs.
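
A rough, untested sketch of the postinst of such an update package (paths, file
names and the squidGuard rebuild step are only illustrations):

  #!/bin/sh
  # postinst of the update-only package Y (sketch)
  set -e

  LIST=/var/lib/squidguard/db/porn/domains   # the text list squidGuard reads
  DATA=/tmp/squidguard-blacklist-update      # remove.txt / add.txt, shipped only temporarily

  # drop the handful of entries that were removed upstream ...
  grep -vxF -f "$DATA/remove.txt" "$LIST" > "$LIST.new"
  # ... and append the new ones
  cat "$DATA/add.txt" >> "$LIST.new"
  mv "$LIST.new" "$LIST"

  # rebuild squidGuard's db files from the updated text list
  squidGuard -C all

  rm -rf "$DATA"   # the data needed only for the update gets deleted again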

What do you think about this idea - does it make sense, or should it rather be
implemented differently?

Alex


Re: realise diff-updates with dpkg

2013-02-20 Thread Paul Wise
In what specific situation did you want to use something like this?
I'm having a hard time imagining an appropriate use case for this
solution.

-- 
bye,
pabs

http://wiki.debian.org/PaulWise





Re: realise diff-updates with dpkg

2013-02-20 Thread iceWave IT
I'd like to use this for antivirus databases, blacklists, etc. - anything that
gets many updates in a short time on huge datasets.

The reason for this is that:
1. Every update would consume a lot of bandwidth, because for every update the
complete database has to be downloaded by aptitude.
2. On every update the complete database would be rewritten by the maintainer
scripts in DEBIAN/postinst, and not only the affected data.

My version would only handle the actual change:
- low bandwidth
- only the affected data has to be updated

Alex




Re: realise diff-updates with dpkg

2013-02-20 Thread Paul Wise
I asked for a specific place you want to use it, rather than some general ideas.

I don't think a generalised mechanism can work in the situations you
are thinking of; per-database mechanisms are really the way to go.

-- 
bye,
pabs

http://wiki.debian.org/PaulWise





Re: realise diff-updates with dpkg

2013-02-20 Thread iceWave IT
OK, here is the specific place:

I've got blacklists, some with over 1 million entries, so the .deb packages
are quite big.

Debdelta doesn't work well here, because that way the whole list would still be
uninstalled and the new list installed. With about 2 million transactions this
takes a lot of time, and I see no reason to delete roughly a million entries
only to install them again a minute later.
Therefore I'd like to put the patch in DEBIAN/postinst of the update package
(delete entries x, y and z; add entries a, b and c).




Re: realise diff-updates with dpkg

2013-02-20 Thread Adam Borowski
On Thu, Feb 21, 2013 at 01:52:55AM +0100, iceWave IT wrote:
 OK, here is the specific place:
 
 I've got blacklists, some with over 1 million entries, so the .deb
 packages are quite big.
 
 Debdelta doesn't work well here, because that way the whole list would
 still be uninstalled and the new list installed.  With about 2 million
 transactions this takes a lot of time, and I see no reason to delete
 roughly a million entries only to install them again a minute later.

 Therefore I'd like to put the patch in DEBIAN/postinst of the update
 package (delete entries x, y and z; add entries a, b and c).

What if a node misses an update, then?

And what if it's not turned on for a month, or gets restored from an old
backup?

There's no way to tell how far back you're going to need to go.

I'm afraid dpkg is not a good way to ship data like this.

-- 
ᛊᚨᚾᛁᛏᚣ᛫ᛁᛊ᛫ᚠᛟᚱ᛫ᚦᛖ᛫ᚹᛖᚨᚲ




Re: realise diff-updates with dpkg

2013-02-20 Thread Paul Wise
A DNSBL is the traditional solution for blacklists, why are you
putting your blacklist in a .deb?

-- 
bye,
pabs

http://wiki.debian.org/PaulWise

