2.0 and if possible 1.3, although the latter might not see daylight. Thanks for the patch!

On Friday 26 November 2010 16:19:58 Claudio Martella wrote:
> Markus, with trunk you mean 1.3 or 2.0? The patches should apply to all
> 1.x.
>
> On 11/26/10 3:15 PM, Markus Jelsma wrote:
> > As reference to other readers:
> > https://issues.apache.org/jira/browse/NUTCH-939
> >
> > On Friday 26 November 2010 11:59:26 Claudio Martella wrote:
> >> Hello list,
> >>
> >> I'm porting recrawl script to use hadoop (on an already existing hadoop
> >> cluster). I attach my version.
> >>
> >> What i found out is that Indexer and SolrIndexer want a list of
> >> segments. It's difficult to obtain the content of a directory through
> >> hdfs (/craw/segments/* will be expanded by bash and hadoop dfs -ls will
> >> return the content with details such as permissions, owners and dates),
> >> so I wrote these little patches to add the -dir option like
> >> SegmentMerger and LinkDB. They are attached too.
> >>
> >> They might be of interest for somebody else.

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
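
For readers who cannot apply the patches, the listing problem described in the original message (hadoop dfs -ls returns permissions, owners and dates along with each path) can be worked around by keeping only the path column. The following is a minimal sketch, not part of the attached patches or the recrawl script. The function names, the /crawl/segments directory and the sample listing are hypothetical, and the sketch assumes the usual eight-column layout of hadoop dfs -ls output with the path as the last column.

```python
import subprocess


def segment_paths(ls_output):
    """Return the path column from the text printed by `hadoop dfs -ls`."""
    paths = []
    for raw in ls_output.splitlines():
        line = raw.strip()
        # Entry lines begin with a permission string ("drwxr-xr-x", "-rw-r--r--");
        # the "Found N items" header and blank lines do not.
        if not line or line[0] not in "d-":
            continue
        # Assumed columns: perms, replication, owner, group, size, date, time, path.
        # Splitting at most 7 times keeps a path containing spaces in one piece.
        fields = line.split(None, 7)
        if len(fields) == 8:
            paths.append(fields[7])
    return paths


def list_segments(segments_dir):
    """Run `hadoop dfs -ls` on segments_dir (needs a hadoop client on PATH)."""
    result = subprocess.run(
        ["hadoop", "dfs", "-ls", segments_dir],
        capture_output=True, text=True, check=True,
    )
    return segment_paths(result.stdout)


# Hypothetical listing, used here so the parser can be tried without a cluster.
SAMPLE = """Found 2 items
drwxr-xr-x   - nutch supergroup          0 2010-11-26 11:02 /crawl/segments/20101126110012
drwxr-xr-x   - nutch supergroup          0 2010-11-26 12:30 /crawl/segments/20101126123007
"""

if __name__ == "__main__":
    # Prints the segment paths space-separated, ready to pass as arguments.
    print(" ".join(segment_paths(SAMPLE)))
```

On a real cluster, list_segments("/crawl/segments") would return the same kind of list, which can then be handed to a command that expects the segments one by one. With the -dir patches applied this parsing step should not be needed.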