David Jashi wrote:
> > Wow. Does it mean we'll have live indexing out of the box?

> If by "live" you mean that you can index a fetched & parsed segment, and have it appear
> immediately in live search after you commit, then yes. Other than that, Nutch still uses
> segments as a unit of work, so the segment generation / fetch / parsing / updatedb etc. are
> still batch operations that take time.

Yes, that's what I meant. Very nice.

> > By the way, is there any chance to modify stemming to process several
> > wordforms (tokens) at once, and not one by one? That would really
> > increase speed of my external stemming.

> You can implement your own analyzer, which first caches all tokens from TokenStream, and
> then passes them all at once to the external process.

Thanks for the hint, I'll dig in that direction.

> --
> Best regards,
> Andrzej Bialecki
> Information Retrieval, Semantic Web
> Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com

--
with best regards,
David Jashi
Web development EO, Caucasus Online
+995(32)970368
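
The analyzer hint quoted above (cache every token first, then hand the whole batch to the external process in one go) might look roughly like the sketch below. It is only an illustration under stated assumptions: a plain `Iterator<String>` stands in for the token stream, and the `Function` passed in stands in for one round trip to the external stemmer. Neither is the real Lucene or Nutch API, whose TokenStream interface differs between versions; `BatchStemming` and `stemInOneBatch` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

public class BatchStemming {

    /**
     * Drains the token source into a list, then stems the whole list with a
     * single call. The Iterator and the Function are stand-ins (assumptions),
     * not Lucene or Nutch classes.
     */
    public static List<String> stemInOneBatch(
            Iterator<String> tokens,
            Function<List<String>, List<String>> externalStemmer) {
        // Step 1: cache every token before stemming anything.
        List<String> cached = new ArrayList<>();
        while (tokens.hasNext()) {
            cached.add(tokens.next());
        }
        if (cached.isEmpty()) {
            return cached; // nothing to send, skip the round trip
        }
        // Step 2: one call for the whole batch instead of one call per token.
        List<String> stems = externalStemmer.apply(cached);
        if (stems.size() != cached.size()) {
            throw new IllegalStateException("stemmer must return one stem per token");
        }
        return stems;
    }

    public static void main(String[] args) {
        // Toy stemmer for the demo only: strips a trailing "s".
        Function<List<String>, List<String>> toyStemmer = batch -> {
            List<String> out = new ArrayList<>();
            for (String t : batch) {
                out.add(t.endsWith("s") ? t.substring(0, t.length() - 1) : t);
            }
            return out;
        };
        List<String> stems = stemInOneBatch(
                Arrays.asList("cats", "dogs", "fish").iterator(), toyStemmer);
        System.out.println(stems);
    }
}
```

In a real analyzer the cached stems would then be replayed as a new token stream in the original order. The saving comes from paying the cost of talking to the external process once per document or field rather than once per token.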
