That was my original idea - and to take things to extremes, you might
even want to recurse directories: using only a couple of levels of
nesting (1-1000/1-1000/1-1000.jpg) would accommodate a billion files
without putting more than 1000 in any single directory. The really nice
thing about this approach is that you can start with a single level and,
in the unlikely event you get more than a million files (you wish! :)),
add another level in no time, with a downtime of probably ten minutes.
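
The mapping from an AUTO_INCREMENT id to such a nested path is just
repeated integer division. A minimal sketch (in Python rather than PHP,
purely for illustration; the bucket size, nesting depth, and .jpg
extension are assumptions, not anything fixed in this thread):

```python
import os

def nested_path(image_id: int, per_dir: int = 1000, levels: int = 2) -> str:
    """Map an AUTO_INCREMENT id to a nested bucket path.

    With 2 levels of 1000 buckets, ids 1-1000 land in 0/0/, ids
    1001-2000 in 0/1/, and so on - a billion files with at most
    1000 per leaf directory.
    """
    parts = []
    bucket = (image_id - 1) // per_dir  # which 1000-file bucket the id is in
    for _ in range(levels):
        parts.append(str(bucket % per_dir))
        bucket //= per_dir
    parts.reverse()  # most-significant bucket first
    return os.path.join(*parts, f"{image_id}.jpg")
```

Adding another level later is then just a rename pass over the existing
buckets, which is where the "ten minutes of downtime" estimate comes from.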

On a slightly different note, since this approach is so flexible, I'd
recommend going for 500 files per directory - a limit which should
guarantee no slowdown. (And yes, I've read Jay's argument, but I've
heard so many counter-arguments over time that I personally prefer to
stay on the safe side - it's no big deal, after all.)
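
For a single level with 500 files per directory, using the same
range-named directories Justin suggests below, the mapping is one
integer division (again a Python sketch for illustration; the function
name is made up):

```python
def bucket_dir(image_id: int, per_dir: int = 500) -> str:
    """Return the range-named directory for an id, 500 files apiece:
    ids 1-500 go in '1-500', ids 501-1000 in '501-1000', etc."""
    low = ((image_id - 1) // per_dir) * per_dir + 1
    return f"{low}-{low + per_dir - 1}"
```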

Just my 2c, obviously.

Bogdan

Justin French wrote:
> So hash them into groups of 1000 images????
> 
> 1-1000
> 1001 - 2000
> 2001 - 3000
> 
> etc etc, and just name the files 1.gif, 2.jpeg, etc etc
> 
> Justin French
> 
> 
> 
> on 22/08/02 12:41 AM, Scott Houseman ([EMAIL PROTECTED]) wrote:
> 
> 
>>Hi there.
>>
>>So what you are suggesting is using an AUTO_INCREMENT field, possibly
>>the image's Primary Key, as an identifier for that image file - which
>>is fine by me. But surely one should store files across directories,
>>as 10000 images in a single directory might slow down access to those
>>images in the filesystem, or not so?
>>
>>Thanks for your input.
>>
>>Regards
>>
>>-Scott
>>
>>
>>>-----Original Message-----
>>>From: DL Neil [mailto:[EMAIL PROTECTED]]
>>>Sent: 21 August 2002 04:31
>>>To: [EMAIL PROTECTED]; Bogdan Stancescu
>>>Subject: Re: [PHP] Image library
>>>
>>>
>>>Scott (confirming Bogdan),
>>>
>>>Libraries of all types have had this concern for years - even though
>>>books are uniquely identified by ISBN, that is still not good enough
>>>for library purposes (eg multiple copies of a single title). So they,
>>>exactly as Bogdan suggests, use an "Accession" number sequence - which
>>>can be implemented very neatly in MySQL (from PHP) as an AUTO_INCREMENT
>>>field.
>>>
>>>Regards,
>>>=dn
>>>
>>>
>>>>I've seen this kind of random approach several times and I keep
>>>>wondering why not counting the files instead. Yes, it may take a little
>>>>longer when uploading but I personally think the safety of the approach
>>>>is worth the insignificant speed sacrifice.
>>>>
>>>>Bogdan
>>>>
>>>>Scott Houseman wrote:
>>>>
>>>>>Hi all.
>>>>>
>>>>>This confirms what I suspected.
>>>>>
>>>>>The hash algorithm:
>>>>>
>>>>>I have a directory structure: dirs 0 - f, and within each of
>>>>>these, the same dir structure 0 - f.
>>>>>When an image gets uploaded into the library, do an md5sum of
>>>>>the file, take
>>>>>the first 2 chars of that hash
>>>>>and there's your path. e.g
>>>>>$PICDBPATH.'/a/7/a7b8be10b0e69fe3decaa538f1febe84'
>>>>>
>>>>>I'm not sure what the mathematical randomness of this is, but I'm sure
>>>>>it's
>>>>>pretty random, and the chances
>>>>>of collision should be virtually null, the only time you should
>>>>>overwrite a
>>>>>file is if you upload the exact same file(?)
>>>>>Is there a better way of doing this?
>>>>>
>>>>>Cheers
>>>>>
>>>>>-Scott
>>>>>
>>>>>
>>>>>
>>>>>>-----Original Message-----
>>>>>>From: Justin French [mailto:[EMAIL PROTECTED]]
>>>>>>Sent: 21 August 2002 03:25
>>>>>>To: [EMAIL PROTECTED]; PHP General
>>>>>>Subject: Re: [PHP] Image library
>>>>>>
>>>>>>
>>>>>>on 21/08/02 9:45 PM, Scott Houseman ([EMAIL PROTECTED]) wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>Which way would be the most efficient/fastest to access images
>>>>>>>from an image library.
>>>>>>>A) Store image files in a hash directory structure AND storing
>>>>>>>each file's information in a mysql table
>>>>>>>OR
>>>>>>>B) Storing image information in mysql table AND storing the
>>>>>>>image in a BLOB field in that table.
>>>>>>
>>>>>>From all accounts I've read on this list, a database is not
>>>>>>usually faster than a filesystem.  And for large amounts of files,
>>>>>>like 1000's, a hash will speed it up more.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>The way I see it, considerations to be taken into acount:
>>>>>>>- Is it quicker/better to retrieve image from table & then stream out
>>>>>>>to browser OR simply direct the browser to the file?
>>>>>>>i.e <IMG SRC="/imagelib/image.php?iImageID=10"> OR <IMG
>>>>>>>SRC="/imagelib/5/f/10">
>>>>>>>- Will a database OR filesystem be more scalable i.e. which
>>>>>>>will perform better when there are 10000 images in the library?
>>>>>>
>>>>>>Filesystem should be quicker.  You need to think about how
>>>>>>you hash the files up for the most even spread of files in each
>>>>>>directory I guess.
>>>>>>
>>>>>>
>>>>>>Justin
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
> 



-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
