Re: Can compression at filesystem level improve overall performance?

2004-03-22 Thread Kris Van Bruwaene
Nikita Danilov wrote:

 Redeeman writes:
  On Fri, 2004-03-19 at 15:25, Erik Terpstra wrote:
   Is it fair to say that today compression at the filesystem level would
   improve overall performance?
  the more aggressively you compress, the more cpu it takes, and that will
  make it slower, but i think a small compression algorithm for filesystem
  purposes could be written... however, i doubt it will be worth it,
  harddrives are really cheap nowadays.. but maybe some algorithm to
  compress cleartext only, or something..

 That's a common misconception. :)

 The goal of compression is to conserve disk bandwidth rather than space.

 By compressing, it is possible to transfer data (== the uncompressed data
 the user works with) at a rate higher than the raw device bandwidth.
 

Something else to consider: the gain might not be so impressive, since 
many files are already heavily compressed: apart from the obvious ones 
(.zip .gz .bz2) most audio (.mp3) and video is natively compressed 
(mpeg2/4), and many office files as well (presentations, pdf...).
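
A minimal Python sketch of the two points above, under stated assumptions. The helper names and the figures (a 50 MB/s disk, a 200 MB/s decompressor) are illustrative and do not come from this thread. The model assumes decompression overlaps with disk reads, and random bytes stand in for payloads that are already compressed.

```python
import os
import zlib


def effective_read_rate(disk_mb_s, ratio, decompress_mb_s):
    """Rate of uncompressed data delivered, in MB/s (simplified model)."""
    # The disk delivers compressed bytes; each one expands by `ratio`.
    from_disk = disk_mb_s * ratio
    # The CPU caps how fast uncompressed bytes can be produced.
    return min(from_disk, decompress_mb_s)


def compression_ratio(data, level=1):
    """Uncompressed size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data, level))


text_like = b"the quick brown fox jumps over the lazy dog\n" * 2000
# Random bytes are a stand-in for .mp3/.zip style payloads (an assumption).
already_compressed = os.urandom(len(text_like))

print(compression_ratio(text_like))           # well above 1
print(compression_ratio(already_compressed))  # about 1, or slightly below
print(effective_read_rate(50, 2.0, 200))      # 100.0
print(effective_read_rate(50, 1.0, 200))      # 50.0
```

In this model a 2:1 ratio doubles the delivered rate until the decompressor becomes the bottleneck, while data with a ratio near 1 gains nothing and still pays the CPU cost.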




Re: Can compression at filesystem level improve overall

2004-03-22 Thread Hans Reiser
The Amazing Dragon (Elliott Mitchell) wrote:

 From: Sean Johnson [EMAIL PROTECTED]
  On Fri, 2004-03-19 at 11:53, Nikita Danilov wrote:
   That's a common misconception. :)

   The goal of compression is to conserve disk bandwidth rather than space.

   By compressing, it is possible to transfer data (== the uncompressed data
   the user works with) at a rate higher than the raw device bandwidth.

  I am far from any kind of authority on filesystems, but doesn't compression
  make data corruption a significantly nastier bugaboo?

 Potentially. Depending upon the encoding, losing one block of encoded data
 maps to losing many blocks of decoded data. Also, losing the first block
 of data might make it impossible to recover later blocks.
 

I think it will just make you lose the compression atom, but Edward can 
say more when he gets back from vacation.
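
One way to picture the "compression atom" remark: if every atom is compressed on its own, damage to one stored atom costs only that atom's data. The sketch below is an illustration only. The 4096-byte atom size and the helper names are assumptions, and it uses zlib rather than whatever codec Reiser4 would use.

```python
import zlib

ATOM_SIZE = 4096  # hypothetical atom size, chosen only for illustration


def compress_atoms(data, atom_size=ATOM_SIZE):
    """Compress each atom independently so atoms can be decoded alone."""
    return [zlib.compress(data[i:i + atom_size])
            for i in range(0, len(data), atom_size)]


def recover_atoms(atoms):
    """Decompress every atom; a damaged atom yields None, others survive."""
    recovered = []
    for blob in atoms:
        try:
            recovered.append(zlib.decompress(blob))
        except zlib.error:
            recovered.append(None)
    return recovered


data = bytes(range(256)) * 64           # 16 KiB, which gives four atoms
atoms = compress_atoms(data)
atoms[1] = atoms[1][:-4] + b"\x00" * 4  # simulate damage to one stored atom
result = recover_atoms(atoms)
print([chunk is not None for chunk in result])  # [True, False, True, True]
```

A single stream compressed end to end would behave differently: everything after the damaged point could become unreadable, which is the worry raised in the quoted text.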

 But these aren't issues since you do error correction near the physical
 layer, and backups just to make sure. You do, don't you?
 



--
Hans


Re: 2.6.4 corruption

2004-03-22 Thread Dieter Nützel
On Monday, 22 March 2004 01:55, Tom Vier wrote:
 after rebooting (no umount, sysreq wouldn't respond) due to a radeon
 problem, some files had their data blocks switched. one file was truncated,
 and what was truncated (it's mounted notail, btw) was attached to another
 file. one file was completely corrupt. this isn't considered normal
 behavior, i assume.

 i'm currently bootstrapping an altroot, so i can run reiserfsck.

Use the data-logging patches for 2.6.4 and up, or wait for 2.6.5/2.6.6.

http://marc.theaimsgroup.com/?l=reiserfs&m=107967943828838&w=2
and
http://marc.theaimsgroup.com/?l=reiserfs&m=107981018429164&w=2

BTW, I have been with DRI-Devel since the beginning, and data-logging WORKS
even with 2.4.

Regards,
Dieter


Re: reiserfsprogs: lib/misc.c: why die() aborts?

2004-03-22 Thread Vladimir Saveliev
Hi

On Sat, 2004-03-20 at 19:18, Domenico Andreoli wrote:
 hi all,
 
 in trying to figure out what the unpack program in reiserfsprogs is
 and if it is supposed to be distributed in the debian package,

No, it should not be distributed

  i came to
 the function die() in lib/misc.c.
 
 i'm seeing that it regularly aborts the execution instead of using the
 usual exit(EXIT_FAILURE). shouldn't it be more polite, or is it not for
 public use?
 
 what is unpack supposed to do?
 
It is used for reiserfsprogs debugging

 cheers
 domenico
 
 -[ Domenico Andreoli, aka cavok
  --[ http://filibusta.crema.unimi.it/~cavok/gpgkey.asc
---[ 3A0F 2F80 F79C 678A 8936  4FEE 0677 9033 A20E BC50
 



Re: reiserfsprogs: lib/misc.c: why die() aborts?

2004-03-22 Thread Chris Mason
On Mon, 2004-03-22 at 07:49, Vladimir Saveliev wrote:
 Hi
 
 On Sat, 2004-03-20 at 19:18, Domenico Andreoli wrote:
  hi all,
  
  in trying to figure out what the unpack program in reiserfsprogs is
  and if it is supposed to be distributed in the debian package,
 
 No, it should not be distributed
 
Hmmm, I disagree.  Normal users can use debugreiserfs -p and unpack to
do test rebuilds on copies of broken filesystems when they are
rebuilding critical data.

-chris




Re: new v3 2.6.4 logging/xattr patches

2004-03-22 Thread Dieter Nützel
On Sunday, 21 March 2004 20:26, Dieter Nützel wrote:
 On Sunday, 21 March 2004 20:22, Dieter Nützel wrote:
  On Sunday, 21 March 2004 17:55, you wrote:
   On Sun, 2004-03-21 at 11:44, Dieter Nützel wrote:
 The suse kernel of the day ;-)  My experimental directory is a dump
 of all the suse reiserfs patches.
   
kernel-source-2.6.4-9.19.i586   (18.03.2004)
   
or
   
kernel-source-2.6.4-13.2.i586   (20.03.2004)
 
  Works out of the box.
  Only Bootsplash is missing.

 Argh, K3b isn't working, again.
 All SCSI (DVD, CD-RW, etc.).

Latest SuSE 9.0 Kernel
kernel-source-2.6.4-13.12.i586  (21.03.2004)

Linux version 2.6.4-13.12-smp ([EMAIL PROTECTED]) (gcc-Version 3.3.1 (SuSE 
Linux)) #1 SMP Mon Mar 22 14:19:29 CET 2004

Is very GOOD in all above aspects. ;-)

dual Athlon MP 1900+
1 GB DDR266, CL2 (2x 512 MB)
Fujitsu MAS3184NP, 15k RPM


Software RAID1

SunWave1 share/dbench# free -t
             total       used       free     shared    buffers     cached
Mem:       1036680     405904     630776          0      53880     137472
-/+ buffers/cache:     214552     822128
Swap:      2103288          0    2103288
Total:     3139968     405904    2734064

SunWave1 share/dbench# time dbench 32
32 clients started

Throughput 118.808 MB/sec (NB=148.51 MB/sec  1188.08 MBit/sec)  32 procs
8.250u 35.218s 0:35.55 122.2%   0+0k 0+0io 0pf+0w

SunWave1 share/dbench# free -t
             total       used       free     shared    buffers     cached
Mem:       1036680     594592     442088          0      52348      96232
-/+ buffers/cache:     446012     590668
Swap:      2103288          0    2103288
Total:     3139968     594592    2545376

Max load was ~15.

--
Second run:

Throughput 131.63 MB/sec (NB=164.537 MB/sec  1316.3 MBit/sec)  32 procs
8.347u 36.151s 0:32.09 138.6%   0+0k 0+0io 0pf+0w

SunWave1 share/dbench# free -t
             total       used       free     shared    buffers     cached
Mem:       1036680     661580     375100          0      49748      72040
-/+ buffers/cache:     539792     496888
Swap:      2103288          0    2103288
Total:     3139968     661580    2478388

Max load was ~18.

**

Software RAID0

SunWave1 SOURCE/dbench# free -t
             total       used       free     shared    buffers     cached
Mem:       1036680     665544     371136          0      51852      76056
-/+ buffers/cache:     537636     499044
Swap:      2103288          0    2103288
Total:     3139968     665544    2474424

SunWave1 SOURCE/dbench# time dbench 32

Throughput 190.978 MB/sec (NB=238.723 MB/sec  1909.78 MBit/sec)  32 procs
8.204u 32.067s 0:22.12 182.0%   0+0k 0+0io 0pf+0w

SunWave1 SOURCE/dbench# free -t
             total       used       free     shared    buffers     cached
Mem:       1036680     437108     599572          0      59028      71192
-/+ buffers/cache:     306888     729792
Swap:      2103288          0    2103288
Total:     3139968     437108    2702860

Max load was ~9.

--

SunWave1 SOURCE/dbench# free -t
             total       used       free     shared    buffers     cached
Mem:       1036680     437280     599400          0      59460      71032
-/+ buffers/cache:     306788     729892
Swap:      2103288          0    2103288
Total:     3139968     437280    2702688
SunWave1 SOURCE/dbench# time dbench 32

Throughput 184.295 MB/sec (NB=230.369 MB/sec  1842.95 MBit/sec)  32 procs
8.115u 32.166s 0:22.92 175.6%   0+0k 0+0io 0pf+0w

SunWave1 SOURCE/dbench# free -t
             total       used       free     shared    buffers     cached
Mem:       1036680     523316     513364          0      78532      67260
-/+ buffers/cache:     377524     659156
Swap:      2103288          0    2103288
Total:     3139968     523316    2616652

Max load was ~10.


Greetings,
Dieter


Re: reiserfsprogs: lib/misc.c: why die() aborts?

2004-03-22 Thread Vladimir Saveliev
Hello

On Mon, 2004-03-22 at 16:26, Chris Mason wrote:
 On Mon, 2004-03-22 at 07:49, Vladimir Saveliev wrote:
  Hi
  
  On Sat, 2004-03-20 at 19:18, Domenico Andreoli wrote:
   hi all,
   
   in trying to figure out what the unpack program in reiserfsprogs is
   and if it is supposed to be distributed in the debian package,
  
  No, it should not be distributed
  
 Hmmm, I disagree.  Normal users can use debugreiserfs -p and unpack to
 do test rebuilds on copies of broken filesystems when they are
 rebuilding critical data.
 

OK, but then we should give it a better name: reiserfs_unpack, for
example.

 -chris
 
 
 



Re: reiserfsprogs: lib/misc.c: why die() aborts?

2004-03-22 Thread Hans Reiser
Vladimir Saveliev wrote:

 Hello

 On Mon, 2004-03-22 at 16:26, Chris Mason wrote:
  On Mon, 2004-03-22 at 07:49, Vladimir Saveliev wrote:
   Hi

   On Sat, 2004-03-20 at 19:18, Domenico Andreoli wrote:
    hi all,

    in trying to figure out what the unpack program in reiserfsprogs is
    and if it is supposed to be distributed in the debian package,

   No, it should not be distributed

  Hmmm, I disagree.  Normal users can use debugreiserfs -p and unpack to
  do test rebuilds on copies of broken filesystems when they are
  rebuilding critical data.

 OK, but then we should give it a better name: reiserfs_unpack, for
 example.

or reiserfs_unpack_metadata_bundle

  -chris

--
Hans


Re: reiserfsprogs: lib/misc.c: why die() aborts?

2004-03-22 Thread Vitaly Fertman
On Monday 22 March 2004 16:45, Vladimir Saveliev wrote:
 Hello

 On Mon, 2004-03-22 at 16:26, Chris Mason wrote:
  On Mon, 2004-03-22 at 07:49, Vladimir Saveliev wrote:
   Hi
  
   On Sat, 2004-03-20 at 19:18, Domenico Andreoli wrote:
    hi all,

    in trying to figure out what the unpack program in reiserfsprogs is
    and if it is supposed to be distributed in the debian package,
  
   No, it should not be distributed
 
  Hmmm, I disagree.  Normal users can use debugreiserfs -p and unpack to
  do test rebuilds on copies of broken filesystems when they are
  rebuilding critical data.

 OK, but then we should give it a better name: reiserfs_unpack, for
 example.

OK, I will add an 'unpack' option to debugreiserfs instead of
keeping the separate unpack program.

-- 
Thanks,
Vitaly Fertman


Re: ReConfigurable Directory Structure Agrregation of files according to semantic.

2004-03-22 Thread Valdis . Kletnieks
On Tue, 16 Mar 2004 23:10:04 EST, Hubert Chan [EMAIL PROTECTED]  said:
 And document files too.  I'm looking forward to being able
 to scrap this strange hierarchy system that I'm currently using for all
 my documents.  Email, too, would do well with this system.  Just toss
 all the mail in a single folder, and have your MUA query the filesystem
 for mails from the ReiserFS list, or mails from friends, etc.

Ad-hoc query support in the file system (or even in user space) is always a
problematic issue, because there are so many corner cases that result in a DWIM
interface problem.

For example:

If you query your music filesystem for "Eric Clapton", should it return "While My
Guitar Gently Weeps" by the Beatles?  If you ask it for songs written by the
artist "Prince", should it return "Manic Monday" by the Bangles (the album
credit says "Christopher")?  The music industry is *full* of that - and queries
like that Just Don't Work unless your metadata is accurate.

Bonus points for being able to handle music by Metallica before they heaved
Mustaine overboard and he went off to make Megadeth - what year did he leave,
and are the songs all *accurately* tagged for release dates?

If some idiot in Zanzibar says "ooh shiny" and clicks on an attachment they
shouldn't have, and starts spewing mail to you that has a friend's address in
the From: field, should "mail from friends" find it?

For that matter, how does my MUA know who "friends" are?  I have some people
that would count as friends who I correspond with on a much lower frequency
than some idiots that I'd rather never hear from again (but have to deal with
due to various obligations).  Equally problematic is when an old college
classmate drops me a note asking about our supercomputer, as an off-list reply
to something I said on a security mailing list (actually happened recently).
Is that a "security", or "friends", or "supercomputing", or "VT News", or
"all/other"?  And how does it know, other than simple word-indexing schemes?  (I
already use 'glimpse', but even that gets painful when your e-mail archive goes
back 15 years and totals over a gigabyte - compound searches take *forever*.)

Semantic analysis is a royal pain - I can't expect the computer to be able to
figure out meanings in order to classify them, when *I* can't do it (I have at
least 10 or 15 pieces of mail that require a reply, but I haven't figured out
yet what the fleep the author was talking about..)





Re: Can compression at filesystem level improve overall performance?

2004-03-22 Thread Scott Young

 
 That's a common misconception. :)

 The goal of compression is to conserve disk bandwidth rather than space.

 By compressing, it is possible to transfer data (== the uncompressed data
 the user works with) at a rate higher than the raw device bandwidth.

I will be doing some research on an algorithm that speeds up data
transfers over a network by adaptively selecting a compression
algorithm.  It can be applied to filesystem reads and writes too.  When
the send queue is reasonably full on the server, it starts compressing
data at the tail of the queue while sending the data at the head of the
queue.  If the output stream catches up to the segment currently being
compressed, then that segment is sent uncompressed.  If the compressed
data is not significantly smaller, then the uncompressed data is sent
instead.  For network applications that are not network-interface bound
(like rsync over a 100 Mbit connection), the buffer will be empty most of
the time and therefore little compression would be needed or wanted, as
it would only slow the application down.  Compression is chosen from a
pool of algorithms and varied depending on the history of buffer
overflows and under-runs.  Slower, better compression algorithms are
used when the buffer is mostly full and the compression is observably
effective.  The idea here is to minimize the time between the client
requesting the data and having usable data.  This can be seen as a
time-versus-amount-of-usable-data-on-client graph: some applications
prefer low latency for the initial stream of data (such as a web page),
whereas others care about the total time to retrieve a very large piece
of data (such as scp [EMAIL PROTECTED]/SomeBigDocument.sxw
/home/scott over a 56k modem).
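
The per-segment decision described above can be sketched roughly as follows. This is a simplified illustration, not the proposed implementation: it picks a codec from the instantaneous queue fill rather than from a history of overflows and under-runs, and the thresholds, the 10% saving requirement, and the function names are all assumptions.

```python
import bz2
import os
import zlib

MIN_SAVING = 0.10  # hypothetical: require at least a 10% size reduction


def pick_codec(queue_fill):
    """Choose a codec from how full the send queue is (0.0 to 1.0)."""
    if queue_fill < 0.25:
        return "raw", None                            # queue nearly empty
    if queue_fill < 0.75:
        return "zlib", lambda d: zlib.compress(d, 1)  # fast, weaker
    return "bz2", lambda d: bz2.compress(d, 9)        # slow, stronger


def encode_segment(segment, queue_fill):
    """Return (codec_name, payload), falling back to raw when gains are small."""
    name, compress = pick_codec(queue_fill)
    if compress is None:
        return "raw", segment
    packed = compress(segment)
    if len(packed) > len(segment) * (1 - MIN_SAVING):
        return "raw", segment                         # not significantly smaller
    return name, packed


text = b"reiserfs " * 1000
print(encode_segment(text, 0.1)[0])               # raw
print(encode_segment(text, 0.5)[0])               # zlib
print(encode_segment(text, 0.9)[0])               # bz2
print(encode_segment(os.urandom(4096), 0.9)[0])   # raw
```

The receiver needs the codec name alongside each payload, so a real protocol would carry it in a small per-segment header.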

Adapting this to filesystem concepts, the server can be seen as the
write process and the client can be seen as the read process.  The idea
can be applied to Reiser4 by compressing the overwrite set while the
journal data is being written, and then compressing the tail of the
relocate set moving backwards until the write stream catches up to the
compression.  It could also take into account the estimated
decompression time when reading the data back, and use it for deciding
whether the compression ratio is good enough to write the compressed
data instead of the uncompressed data.

Another interesting twist would be to cache the compressed data if the
same data is going to be sent from the server several times.  This
reduces CPU overhead on the server (and possibly its memory
requirements for caching the data, and reduces the amount of data that
needs to be read from the drive), but it is complicated in the context
of a network algorithm and is mostly application-dependent.  This is
research for another day, maybe in the form of a derived-data plugin for
ReiserFS where an application tells the filesystem how to construct the
file, and the filesystem can store the original, the result, or both,
depending on space needs and performance analysis, with copy-on-write
metadata flags when appropriate.

I haven't started coding the adaptive compression algorithm yet, but I
have a general idea about how I am going to implement it.  For the
proof-of-concept, I want to write this using sockets and some basic
library compression algorithms (gzip, bzip2, and maybe a simple MTF +
Adaptive Huffman).  Later variants may work with TCP or other protocols
around that layer.  Any suggestions will be appreciated.


Scott Young






Re: Can compression at filesystem level improve overall performance?

2004-03-22 Thread Hans Reiser
Scott Young wrote:

  That's a common misconception. :)

  The goal of compression is to conserve disk bandwidth rather than space.

  By compressing, it is possible to transfer data (== the uncompressed data
  the user works with) at a rate higher than the raw device bandwidth.

 I will be doing some research on an algorithm that speeds up data
 transfers over a network by adaptively selecting a compression
 algorithm.  It can be applied to filesystem reads and writes too.  When
 the send queue is reasonably full on the server, it starts compressing
 data at the tail of the queue while sending the data at the head of the
 queue.  If the output stream catches up to the segment currently being
 compressed, then that segment is sent uncompressed.  If the compressed
 data is not significantly smaller, then the uncompressed data is sent
 instead.  For network applications that are not network-interface bound
 (like rsync over a 100 Mbit connection), the buffer will be empty most of
 the time and therefore little compression would be needed or wanted, as
 it would only slow the application down.  Compression is chosen from a
 pool of algorithms and varied depending on the history of buffer
 overflows and under-runs.  Slower, better compression algorithms are
 used when the buffer is mostly full and the compression is observably
 effective.  The idea here is to minimize the time between the client
 requesting the data and having usable data.  This can be seen as a
 time-versus-amount-of-usable-data-on-client graph: some applications
 prefer low latency for the initial stream of data (such as a web page),
 whereas others care about the total time to retrieve a very large piece
 of data (such as scp [EMAIL PROTECTED]/SomeBigDocument.sxw
 /home/scott over a 56k modem).

 Adapting this to filesystem concepts, the server can be seen as the
 write process and the client can be seen as the read process.

I don't understand.  Why not view the client as the disk drive and the 
bus as the network?

 The idea
 can be applied to Reiser4 by compressing the overwrite set while the
 journal data is being written, and then compressing the tail of the
 relocate set moving backwards until the write stream catches up to the
 compression.  It could also take into account the estimated
 decompression time when reading the data back, and use it for deciding
 whether the compression ratio is good enough to write the compressed
 data instead of the uncompressed data.
 

I didn't understand the above.

 Another interesting twist would be to cache the compressed data if the
 same data is going to be sent from the server several times.  This
 reduces CPU overhead on the server (and possibly its memory
 requirements for caching the data, and reduces the amount of data that
 needs to be read from the drive), but it is complicated in the context
 of a network algorithm and is mostly application-dependent.  This is
 research for another day, maybe in the form of a derived-data plugin for
 ReiserFS where an application tells the filesystem how to construct the
 file, and the filesystem can store the original, the result, or both,
 depending on space needs and performance analysis, with copy-on-write
 metadata flags when appropriate.
 

I didn't understand the above.

 I haven't started coding the adaptive compression algorithm yet, but I
 have a general idea about how I am going to implement it.  For the
 proof-of-concept, I want to write this using sockets and some basic
 library compression algorithms (gzip, bzip2, and maybe a simple MTF +
 Adaptive Huffman).  Later variants may work with TCP or other protocols
 around that layer.  Any suggestions will be appreciated.
 

I think we need to use adaptive compression in Reiser4, based on the 
type of file being compressed,  and anyone who finds it interesting to 
develop heuristics for selecting compression strategies is welcome to 
help and join the fun.
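
As a starting point for such heuristics, here is a hedged Python sketch that combines a file-type check with a quick probe of the file's first bytes. Nothing here is Reiser4 code: the extension set, the 4096-byte probe, the 1.2 ratio threshold, and the strategy names are assumptions chosen for illustration.

```python
import os
import zlib

# File types named earlier in the thread as already compressed, plus .mpg;
# the set is illustrative, not exhaustive.
ALREADY_COMPRESSED = {".zip", ".gz", ".bz2", ".mp3", ".mpg", ".pdf"}
SAMPLE_SIZE = 4096        # hypothetical probe size
MIN_SAMPLE_RATIO = 1.2    # hypothetical threshold


def choose_strategy(filename, head):
    """Pick 'none' or 'deflate' from the file name and its first bytes."""
    ext = os.path.splitext(filename)[1].lower()
    if ext in ALREADY_COMPRESSED:
        return "none"
    sample = head[:SAMPLE_SIZE]
    if not sample:
        return "none"
    # Probe with the cheapest zlib level; only the ratio matters here.
    ratio = len(sample) / len(zlib.compress(sample, 1))
    return "deflate" if ratio >= MIN_SAMPLE_RATIO else "none"


print(choose_strategy("song.mp3", b"a" * 100))              # none
print(choose_strategy("notes.txt", b"hello world\n" * 500)) # deflate
print(choose_strategy("blob.bin", os.urandom(4096)))        # none
```

The probe catches mislabelled or extensionless files, at the cost of compressing one small sample per file.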

 Scott Young





 



--
Hans


Re: ReConfigurable Directory Structure Agrregation of files according to semantic.

2004-03-22 Thread Hubert Chan
>>>>> "Valdis" == Valdis Kletnieks [EMAIL PROTECTED] writes:

Valdis> On Tue, 16 Mar 2004 23:10:04 EST, Hubert Chan [EMAIL PROTECTED]
Valdis> said:
Hubert> And document files too.  I'm looking forward to being able to
Hubert> scrap this strange hierarchy system that I'm
Hubert> currently using for all my documents.  Email, too, would do well
Hubert> with this system.  Just toss all the mail in a single folder,
Hubert> and have your MUA query the filesystem for mails from the
Hubert> ReiserFS list, or mails from friends, etc.

Valdis> Ad-hoc query support in the file system (or even in user space)
Valdis> is always a problematic issue, because there are so many corner
Valdis> cases that result in a DWIM interface problem.

Yes.  When I wrote what I wrote, I was assuming that you have good
metadata that the filesystem can use.  I assume that semantic analysis,
DWIM, mind reading, etc., is out of the scope of the filesystem layer,
but anyone is free to implement a plugin that does what they want.

So when I say "mails from friends", I'm skipping most of the details,
which I think most people don't care about.  I'm really saying that I
would give the filesystem a list of email addresses of people that I
consider friends, and get the filesystem to find all mails from those
addresses.  If someone from Zanzibar starts flooding me with the
virus-du-jour, then I refine my filter so that it excludes mails that
clamav detects viruses in.  (Actually, I would just configure clamav to
delete all mails that contain viruses.)  If my old college friend drops
me a mail, and I decide I want his mails to show up in my "friends"
query, then I add his email address to the list.  (Or add him to my
address book, mark him as "friend", and let the filesystem do the join.)
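
A small Python sketch of that query, assuming the mail metadata is already available as records. The record layout, the addresses, and the `virus` flag (standing in for a clamav verdict) are hypothetical; a real filesystem plugin would expose this differently.

```python
from email.utils import parseaddr

# Hypothetical data: a real setup would read these from mail files
# and an address book.
friends = {"alice@example.org", "bob@example.org"}
mails = [
    {"id": 1, "from": "Alice <alice@example.org>", "virus": False},
    {"id": 2, "from": "bob@example.org", "virus": True},
    {"id": 3, "from": "Mallory <mallory@example.net>", "virus": False},
]


def mails_from_friends(mails, friends):
    """Select clean mails whose From: address is in the friends list."""
    selected = []
    for mail in mails:
        address = parseaddr(mail["from"])[1].lower()
        if address in friends and not mail["virus"]:
            selected.append(mail["id"])
    return selected


print(mails_from_friends(mails, friends))  # [1]
```

Refining the query is then a matter of editing the friends set or adding another predicate, which matches the "add his address to the list" workflow described above.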

(Actually, I don't think I personally would use a "mails from friends"
query.  I would generally just use a "mails to/from person x" query.)

[...]

Valdis> And how does it know, other than simple word-indexing schemes? (I
Valdis> already use 'glimpse', but even that gets painful when your
Valdis> e-mail archive goes back 15 years and totals over a gigabyte -
Valdis> compound searches take *forever*.)

Well, Reiser4(? - with the appropriate plugin?) will let you add
arbitrary attributes through the everything-is-a-file-and-a-directory
mechanism, so you can add, for example, a "tags" attribute.  For all the
mails dealing with supercomputing, you add the "supercomputing" tag.  For
all security mails (that aren't already handled by your
security-mailing-list, etc. filters), you can add the "security" tag.
Then you tell the filesystem that the "tags" attribute is a handy thing
to index.  Again, I'm assuming that the user provides the filesystem
with useful metadata.  I don't assume that the filesystem can read your
mails and do automatic classification with 100% accuracy.
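
To make the indexing idea concrete, here is a minimal in-memory sketch in Python. It only models the concept of an indexed "tags" attribute; the class, its methods, and the example paths are hypothetical and are not a Reiser4 interface.

```python
from collections import defaultdict


class TagIndex:
    """In-memory inverted index from tag to the set of tagged paths."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def add(self, path, tags):
        """Record that `path` carries each of the user-supplied tags."""
        for tag in tags:
            self._by_tag[tag].add(path)

    def query(self, *tags):
        """Return paths carrying every requested tag, in sorted order."""
        if not tags:
            return []
        sets = [self._by_tag.get(tag, set()) for tag in tags]
        return sorted(set.intersection(*sets))


index = TagIndex()
index.add("mail/0001", ["security"])
index.add("mail/0002", ["security", "supercomputing"])
index.add("mail/0003", ["friends"])
print(index.query("security"))                    # ['mail/0001', 'mail/0002']
print(index.query("security", "supercomputing"))  # ['mail/0002']
```

Because lookups go through the index rather than scanning mail bodies, a compound query is a set intersection, which is the kind of cost an indexed attribute is meant to avoid paying at search time.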

-- 
Hubert Chan [EMAIL PROTECTED] - http://www.uhoreg.ca/
PGP/GnuPG key: 1024D/124B61FA
Fingerprint: 96C5 012F 5F74 A5F7 1FF7  5291 AF29 C719 124B 61FA
Key available at wwwkeys.pgp.net.   Encrypted e-mail preferred.