Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-13 Thread JohnS
On Mon, 2009-07-13 at 05:49 +, o wrote: It is 1024 chars long. Which still won't help. I'm using MyISAM and according to: http://dev.mysql.com/doc/refman/5.1/en/myisam-storage-engine.html The maximum key length is 1000 bytes. This can also be changed by changing

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-13 Thread Les Mikesell
JohnS wrote: On Mon, 2009-07-13 at 05:49 +, o wrote: It is 1024 chars long. Which still won't help. I'm using MyISAM and according to: http://dev.mysql.com/doc/refman/5.1/en/myisam-storage-engine.html The maximum key length is 1000 bytes. This can also be changed

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-12 Thread oooooooooooo ooooooooooooo
How many files per directory do you have? I have 4 directory levels, 65536 leaf directories and around 200 files per dir (15M in total). Something is wrong. Got to figure this out. Where did this RAM go? Thanks, I reduced the memory usage of mysql and my app and I got around a 15%
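As a rough check on the layout quoted above, the figures can be reproduced in a few lines of Python. This is only a sketch: the fan-out of 16 entries per level is an assumption inferred from 65536 = 16^4 (one hex character per directory level), and the variable names are made up for illustration.

```python
# Assumed layout: 4 directory levels, one hex character (16 choices) per level.
LEVELS = 4
FANOUT = 16               # assumption: 0-9 and a-f
TOTAL_FILES = 15_000_000  # "15M in total" from the thread

leaf_dirs = FANOUT ** LEVELS            # number of leaf directories
files_per_leaf = TOTAL_FILES / leaf_dirs

print(leaf_dirs)              # 65536
print(round(files_per_leaf))  # 229, i.e. "around 200 files per dir"
```

With an even hash distribution each leaf directory stays in the low hundreds of entries, which matches the "around 200" quoted in the message.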

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-11 Thread Alexander Georgiev
2009/7/11 o hhh...@hotmail.com: You mentioned that the data can be retrieved from somewhere else. Is some part of this filename a unique key? The real key is up to 1023 characters long and it's unique, but I have to trim it to 256 characters, and that way it is not unique

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-11 Thread oooooooooooo ooooooooooooo
Thanks, using directories as file names is a great idea. Anyway, I'm not sure that would solve my performance issue, as the bottleneck is the disk and not mysql. I just implemented the directory names based on the hash of the file and the performance is a bit slower than before. This is

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-11 Thread Alexander Georgiev
Thanks, using directories as file names is a great idea. Anyway, I'm not sure that would solve my performance issue, as the bottleneck is the disk and not mysql. The situation you described initially suffers from only one issue - too many files in one single directory. You are not the

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-11 Thread JohnS
On Sat, 2009-07-11 at 00:01 +, o wrote: You mentioned that the data can be retrieved from somewhere else. Is some part of this filename a unique key? The real key is up to 1023 characters long and it's unique, but I have to trim it to 256 characters, and that way

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-11 Thread JohnS
On Sat, 2009-07-11 at 11:48 -0400, JohnS wrote: On Sat, 2009-07-11 at 00:01 +, o wrote: You mentioned that the data can be retrieved from somewhere else. Is some part of this filename a unique key? The real key is up to 1023 characters long and it's

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread oooooooooooo ooooooooooooo
Hi, After talking with the customer, I finally managed to convince him to use the first characters of the hash as directory names. Now I'm in doubt about the following options: a) Using 4 directory levels /c/2/a/4/ (200 files per directory) and mysql with a hash-filename table, so I can get
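The scheme described above, with the first characters of the hash used as directory names, could be sketched in Python roughly as follows. The thread does not say which hash function is used, so SHA-1 is an assumption here, and `hashed_dir` is a hypothetical helper name rather than anything from the original application.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_dir(name: str, levels: int = 4) -> PurePosixPath:
    """Return a directory such as /c/2/a/4 built from the first hex digits of a hash.

    Assumption: SHA-1 over the UTF-8 bytes of the original file name.
    """
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    # One directory level per leading hex character of the digest.
    return PurePosixPath("/", *digest[:levels])


# The same name always maps to the same directory, so no lookup table is
# needed just to find where a file lives.
print(hashed_dir("some very long original file name"))
```

Because the directory is derived from the name alone, the mysql table in option a) would only be needed for the hash-to-filename mapping, not for locating the directory.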

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Les Mikesell
o wrote: Hi, After talking with the customer, I finally managed to convince him to use the first characters of the hash as directory names. Now I'm in doubt about the following options: a) Using 4 directory levels /c/2/a/4/ (200 files per directory) and mysql

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Alexander Georgiev
My original idea was to use just the hash as the filename; that way I could have direct access. But the customer rejected this and requested to have part of the long file name (from 11 to 1023 characters). As Linux only allows 256 characters in the path and I could get duplicates with the

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread oooooooooooo ooooooooooooo
Ok, I could use mysql, but consider that we have around 15M entries and I would have to add to each a file from 1KB to 150KB; in total the file size can be around 200GB. How would this perform in mysql?

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Alexander Georgiev
2009/7/10, o hhh...@hotmail.com: Ok, I could use mysql, but consider that we have around 15M entries and I would have to add to each a file from 1KB to 150KB; in total the file size can be around 200GB. How would this perform in mysql? in the worst case - 150KB

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Filipe Brandenburger
On Fri, Jul 10, 2009 at 16:21, Alexander Georgiev alexander.georg...@gmail.com wrote: I would use either only a database, or only the file system. To me - using them both is a violation of KISS. I disagree with your general statement. Storing content that is appropriate for files (e.g.,

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread oooooooooooo ooooooooooooo
I don't think you've explained the constraint that would make you use mysql or not. My original idea was to use just the hash as the filename; that way I could have direct access. But the customer rejected this and requested to have part of the long file name (from 11 to 1023 characters).

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Alexander Georgiev
2009/7/10, Filipe Brandenburger filbran...@gmail.com: On Fri, Jul 10, 2009 at 16:21, Alexander Georgiev alexander.georg...@gmail.com wrote: I would use either only a database, or only the file system. To me - using them both is a violation of KISS. I disagree with your general statement.

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread oooooooooooo ooooooooooooo
According to my tests the average size per file is around 15KB (although there are files from 1KB to 150KB).

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Les Mikesell
o wrote: I don't think you've explained the constraint that would make you use mysql or not. My original idea was to use just the hash as the filename; that way I could have direct access. But the customer rejected this and requested to have part of the long

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread oooooooooooo ooooooooooooo
You mentioned that the data can be retrieved from somewhere else. Is some part of this filename a unique key? The real key is up to 1023 characters long and it's unique, but I have to trim it to 256 characters, and that way it is not unique unless I add the hash. Do you have to track this
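The idea of trimming the long key and appending a hash to keep it unique might look like the Python sketch below. Several things here are assumptions rather than details from the thread: SHA-1 as the hash, a limit of 255 bytes per file name (the usual per-component limit on Linux filesystems such as ext3, where the message says 256), keys that are plain ASCII without "/" characters, and the helper name `short_unique_name`.

```python
import hashlib

MAX_NAME = 255  # assumption: usual per-component file name limit, in bytes


def short_unique_name(key: str, max_len: int = MAX_NAME) -> str:
    """Trim a long key to fit max_len and append a hash so names stay distinct.

    Assumes ASCII keys (one byte per character) that contain no '/'.
    """
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()  # 40 hex characters
    # Leave room for the separator and the digest after the readable prefix.
    prefix = key[: max_len - len(digest) - 1]
    return f"{prefix}-{digest}"


# Two keys that share their first 1000 characters still get different names.
a = short_unique_name("x" * 1000 + "a")
b = short_unique_name("x" * 1000 + "b")
print(len(a), a != b)
```

The readable prefix satisfies the request to keep part of the long name, while the digest suffix distinguishes keys that are identical after trimming.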

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-09 Thread oooooooooooo ooooooooooooo
There's C code to do this in squid, and backuppc does it in perl (for a pool directory where all identical files are hardlinked). Unfortunately I have to write the file with some predefined format, so these would not provide the flexibility I need. Rethink how you're writing files or you'll

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-09 Thread JohnS
On Wed, 2009-07-08 at 16:14 -0600, Frank Cox wrote: On Wed, 08 Jul 2009 18:09:28 -0400 Filipe Brandenburger wrote: You can hash it and still keep the original filename, and you don't even need a MySQL database to do lookups. Now that is slick as all get-out. I'm really impressed with your

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-09 Thread James A. Peltier
On Thu, 9 Jul 2009, o wrote: It's possible that I will be able to name the directory tree based on the hash of the file, so I would get the structure described in one of my previous posts (4 directory levels, each directory name would be a single character from 0-9

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-09 Thread JohnS
On Thu, 2009-07-09 at 10:09 -0700, James A. Peltier wrote: On Thu, 9 Jul 2009, o wrote: It's possible that I will be able to name the directory tree based on the hash of the file, so I would get the structure described in one of my previous posts (4 directory

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-09 Thread James A. Peltier
On a side note, perhaps this is something that Hadoop would be good with.

[CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread oooooooooooo ooooooooooooo
Hi, I have a program that writes lots of files to a directory tree (around 15 million files), and a node can have up to 40 files (and I don't have any way to split this amount into smaller ones). As the number of files grows, my application gets slower and slower (the app works

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Niki Kovacs
o wrote: Hi, I have a program that writes lots of files to a directory tree Did that program also write your address header? :o)

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Per Qvindesland
Perhaps think about running tune2fs; maybe also consider adding noatime. Regards, Per

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Les Mikesell
o wrote: Hi, I have a program that writes lots of files to a directory tree (around 15 million files), and a node can have up to 40 files (and I don't have any way to split this amount into smaller ones). As the number of files grows, my application gets

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Kwan Lowe
On Wed, Jul 8, 2009 at 2:27 AM, o hhh...@hotmail.com wrote: Hi, I have a program that writes lots of files to a directory tree (around 15 million files), and a node can have up to 40 files (and I don't have any way to split this amount into smaller ones). As

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Gary Greene
On 7/8/09 8:56 AM, Les Mikesell lesmikes...@gmail.com wrote: o wrote: Hi, I have a program that writes lots of files to a directory tree (around 15 million files), and a node can have up to 40 files (and I don't have any way to split this amount into smaller

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread oooooooooooo ooooooooooooo
for the help. From: hhh...@hotmail.com To: centos@centos.org Date: Wed, 8 Jul 2009 06:27:40 + Subject: [CentOS] Question about optimal filesystem with many small files. Hi, I have a program that writes lots of files to a directory tree (around 15

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread oooooooooooo ooooooooooooo
tree, what gives the issue. Did that program also write your address header ? :) Thanks for the help. From: hhh...@hotmail.com To: centos@centos.org Date: Wed, 8 Jul 2009 06:27:40 + Subject: [CentOS] Question about optimal filesystem

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Filipe Brandenburger
Hi, On Wed, Jul 8, 2009 at 17:59, o hhh...@hotmail.com wrote: My original idea was to store the file under a hash of its name, and then store a hash-real filename mapping in mysql. That way I have direct access to the file and I can make a directory hierarchy with the first

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Frank Cox
On Wed, 08 Jul 2009 18:09:28 -0400 Filipe Brandenburger wrote: You can hash it and still keep the original filename, and you don't even need a MySQL database to do lookups. Now that is slick as all get-out. I'm really impressed with your scheme, though I don't actually have any use for it right at

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread oooooooooooo ooooooooooooo
You can hash it and still keep the original filename, and you don't even need a MySQL database to do lookups. There is an issue I forgot to mention: the original file name can be up to 1023 characters long. As Linux only allows 256 characters in the file path, I could have a (very small)

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Les Mikesell
o wrote: You can hash it and still keep the original filename, and you don't even need a MySQL database to do lookups. There is an issue I forgot to mention: the original file name can be up to 1023 characters long. As Linux only allows 256 characters in the file

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread James A. Peltier
On Wed, 8 Jul 2009, o wrote: Hi, I have a program that writes lots of files to a directory tree (around 15 million files), and a node can have up to 40 files (and I don't have any way to split this amount into smaller ones). As the number of files grows,

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread James A. Peltier
On Wed, 8 Jul 2009, o wrote: Hi, I have a program that writes lots of files to a directory tree (around 15 million files), and a node can have up to 40 files (and I don't have any way to split this amount into smaller ones). As the number of files grows,

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread nate
James A. Peltier wrote: There isn't a good file system for this type of thing. Filesystems with many very small files are always slow. Ext3, XFS, JFS are all terrible for this type of thing. I can think of one... though you'll pay out the ass for it: the Silicon file system from BlueArc

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Alexander Georgiev
2009/7/9, o hhh...@hotmail.com: After a quick calculation, that could put around 3200 files per directory (I have around 15 million files). I think that above 1000 files the performance will start to degrade significantly; anyway, it would be a matter of doing some