Problem understanding how Nutch saves information to its filesystem

William Choi Mon, 22 May 2006 11:26:42 -0700

Hi all,

I've downloaded the latest Nutch 0.8.0 and Hadoop (05-08-06) versions, and I
want to study how Nutch writes its data to the filesystem across all the
datanodes (for example, if I already have my content indexed, how do I put it
into the Hadoop filesystem?). I've searched online, but there isn't much
information about it. I've studied the Nutch and Hadoop code, but that just
left me more confused. I'd appreciate it if an expert could give me the big
picture or point me to where to start.
  
Thanks to Andrzej for the previous reply, which mentioned that using hadoop dfs
copyFromLocal will work. That leads to my other questions:

  1) After using the copyFromLocal command, will the search find the right
data by itself? I assume I'll still need to do some more work to make it work.
  
  2) Since I don't have the big picture yet: what input data does
copyFromLocal require? I have my own modified Lucene that indexes all the
data for me, and its data format may not be the same as Nutch's. If I use
that command to put the data into the filesystem, what else do I need to
implement in order to search or update it?
  
  3) Which files should I study that are related to this part of the work?
  
     Any help will be appreciated.
  
  William
  
                        
