Thanks, Timothy, for your short answer; I guess I have to be a bit
more specific!
Actually I'm interested in the distributed FS rather than in the
Map/Reduce features. Did HDFS change very much since it was moved
out of the Nutch project? As far as I can tell the version of
Hadoop is a very early one, but it hasn't been developed from
scratch, right?
I remember that Nutch has been around for a while, and Nutch makes
use of HDFS as well.
Is there anybody around who runs Hadoop in a production environment
at all?
Yes, there are. :) It works ...but we had quite a few problems and had
to maintain our own patches to get by. It still feels quite rough
around the edges.
Timothy talked about "major bugs" still coming up, so are they mostly
related to new features, or also to the DFS? (I don't need details,
don't worry.)
Also the DFS! For example, we hit a StackOverflowError just because we
had too many files in one directory. But in general it seems OK.
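
For the curious: one way to dodge that class of problem is to avoid
huge flat directories in the first place. The sketch below is just an
illustration against the standard org.apache.hadoop.fs API, not the
patch we actually maintain; the bucket count and path layout are made
up.

    // Illustrative only: spread files over hashed subdirectories so
    // that no single HDFS directory grows unbounded.
    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BucketedWriter {
        private static final int BUCKETS = 256; // hypothetical count

        // Maps a name to /data/<bucket>/<name> instead of one flat dir.
        static Path bucketedPath(String name) {
            int bucket = (name.hashCode() & Integer.MAX_VALUE) % BUCKETS;
            return new Path("/data/" + bucket + "/" + name);
        }

        public static void main(String[] args) throws IOException {
            FileSystem fs = FileSystem.get(new Configuration());
            Path p = bucketedPath("example.dat");
            fs.mkdirs(p.getParent());   // create the bucket directory
            FSDataOutputStream out = fs.create(p);
            out.writeBytes("hello");
            out.close();
        }
    }

Hashing keeps the buckets roughly even without any central
bookkeeping.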
What I need to know is whether it is worth using HDFS to build a
prototype and a proof of concept for an upcoming project. If we
cannot use it due to its early development state, that's OK, but we
could save heaps of time with some information or recommendations.
I found the setup and the scripts for running it quite painful for a
real production environment. We now have proper init.d scripts and
everything packaged up nicely as Debian packages, so installation
became a one-liner.
Hope we can contribute that back soon ...but at the moment there's not
much time for that.
As of now we are running 10.1 + patches.
HTH
cheers
--
Torsten