In wiki text, external links look like [url description]. A category declaration takes the form [[Category:*Category name*]] or [[Category:*Category name*|*Sortkey*]].
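As a rough illustration of the two syntaxes above, here is a minimal Python sketch that pulls categories and external links out of a page's wiki text so they can be stored in separate fields. It is a preprocessing step, not a Lucene analyzer. The function name `extract_fields` and the returned field names are hypothetical, and the regular expressions are an assumption that covers only the simple forms described here; real wiki text (templates, nested links, other namespaces) needs more care.

```python
import re

# Assumption: only the simple forms [[Category:Name]] and [[Category:Name|Sortkey]].
CATEGORY_RE = re.compile(r"\[\[Category:([^\]|]+)(?:\|([^\]]*))?\]\]")

# Assumption: external links are single-bracketed and start with http:// or https://.
# The lookarounds skip double-bracketed internal links such as [[Page]].
EXTERNAL_LINK_RE = re.compile(
    r"(?<!\[)\[(https?://[^\s\]]+)(?:\s+([^\]]*))?\](?!\])"
)


def extract_fields(wikitext):
    """Split wiki text into categories, external link URLs, and remaining text."""
    categories = [m.group(1).strip() for m in CATEGORY_RE.finditer(wikitext)]
    external_links = [m.group(1) for m in EXTERNAL_LINK_RE.finditer(wikitext)]
    # Drop the category declarations from the body; external links stay in place.
    text = CATEGORY_RE.sub("", wikitext).strip()
    return {
        "categories": categories,
        "external_links": external_links,
        "text": text,
    }


if __name__ == "__main__":
    sample = (
        "See [http://example.org Example site] for details.\n"
        "[[Category:Search engines|Lucene]]\n"
        "[[Category:Java]]"
    )
    fields = extract_fields(sample)
    print(fields["categories"])      # ['Search engines', 'Java']
    print(fields["external_links"])  # ['http://example.org']
```

Each key of the returned dictionary could then be mapped to its own field in the index.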
Writing an analyzer is pretty simple. Good luck!

On Sun, May 13, 2012 at 8:55 PM, vineet yadav <vineet.yadav.i...@gmail.com> wrote:

> Hi all,
> I want to create a Lucene/Solr index of the Wikipedia XML dump. I used the Solr
> example (http://wiki.apache.org/solr/DataImportHandler#Example:_Indexing_wikipedia)
> to index the Wikipedia XML dump. Since categories and external links are part
> of the Wikipedia text, I am not able to index them separately. I want to index
> categories, external links, etc. separately and store them in separate fields.
> Would anyone please be kind enough to give me a bit of advice?
> Thanks
> Vineet Yadav

--
Oren Bochman
Office tel. 061 4921492
Mobile +36 30 866 6706
skype id: orenbochman
e-mail: o...@romai-horizon.com
site http://www.riverport.hu