Re: PlainTextEntityProcessor and RegexTransformer in DataImport Handler
Thanks Matthew, it really helped a lot. I am almost done with this.

--
View this message in context: http://lucene.472066.n3.nabble.com/PlainTextEntityProcessor-and-RegexTransformer-in-DataImport-Handler-tp3608449p3612674.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: PlainTextEntityProcessor and RegexTransformer in DataImport Handler
I would try something like the following:

<dataConfig>
  <dataSource type="FileDataSource" />
  <script><![CDATA[
    function format(row) {
      var text = row.get('plainText');
      // Do the regex processing here with JavaScript's RegExp object.
      var results = text;
      row.put('all_text', results); // store the results in the all_text field
      return row;
    }
  ]]></script>
  <document>
    <entity name="f" processor="FileListEntityProcessor"
            baseDir="[path to text file directory]" fileName=".*txt"
            rootEntity="false" dataSource="null">
      <entity name="x" processor="PlainTextEntityProcessor"
              url="${f.fileAbsolutePath}" rootEntity="true"
              dataSource="null" transformer="script:format">
      </entity>
    </entity>
  </document>
</dataConfig>

On Fri, Dec 23, 2011 at 7:41 AM, meghana <meghana.rav...@amultek.com> wrote:

Hi, does anybody have any idea how I can achieve this? Also, if it is possible to convert a multivalued field to a non-multivalued field, that would work for me too. I have a custom multivalued field, ArrText, with the values shown below:

<arr name="ArrText">
  <str>12 : Hello World!!</str>
  <str>14 : Welcome to Solr.</str>
  <str>15 : Enjoy</str>
</arr>

If we can convert this to my desired result, that would be great.

Thanks in advance.
Meghana
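The regex step that the `format` function above leaves as a comment could be sketched roughly as follows. This is only an illustration, not a tested DIH configuration: `formatParagraphs` is a hypothetical helper name, and it assumes the attribute values are double-quoted and that each p element contains plain text with no nested tags. It sticks to `var` and `RegExp.exec` so that it should also run on older JavaScript engines.

```javascript
// Hypothetical helper (not part of Solr): turns each
// <p myvar="N" ...>text</p> into a line of the form "N : text".
// Assumes double-quoted attribute values and no tags nested inside <p>.
function formatParagraphs(html) {
  // Group 1: the value of myvar. Group 2: the text inside the <p> element.
  var pattern = /<p\b[^>]*?\bmyvar="([^"]*)"[^>]*>([^<]*)<\/p>/g;
  var lines = [];
  var match;
  // exec() with the g flag resumes after the previous match on each call.
  while ((match = pattern.exec(html)) !== null) {
    lines.push(match[1] + " : " + match[2]);
  }
  // A single string, so the result fits a single-valued field.
  return lines.join("\n");
}
```

Inside `format(row)`, the result could then be stored with something like `row.put('all_text', formatParagraphs(String(row.get('plainText'))))`. The `String(...)` conversion is an assumption about how the script engine hands over the Java string, so treat that line as a starting point.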
PlainTextEntityProcessor and RegexTransformer in DataImport Handler
Hi all,

I need to import data from a text file that contains HTML and apply some formatting to it. I want the text inside each p tag, preceded by the value of one attribute of that p tag, as shown below.

Original text:

<div><p myvar="12" myvar1="xyz">Hello World!!</p><p myvar="14" myvar1="abc">Welcome to Solr.</p><p myvar="15" myvar1="def">Enjoy</p></div>

Needed text after formatting:

12 : Hello World!!
14 : Welcome to Solr.
15 : Enjoy

I tried combining PlainTextEntityProcessor with RegexTransformer and TemplateTransformer as below, but I get a ConfigurationError when I set it up.

<entity name="xx" onError="continue" processor="PlainTextEntityProcessor"
        transformer="TemplateTransformer,RegexTransformer"
        url="${URL.MyTxtFile}" dataSource="MDataSource">
  <field column="plainText" name="FullText" />
  <field column="FullText" template="${xx.FullText}"
         regex='lt;p (?:\s+[^]+)? myvar=([^]*) (?:\s+[^]+)?([^]*)/p'
         replaceWith="$2 : $4"/>
</entity>

I should add that I am able to do this with TemplateTransformer and a multivalued field by setting foreach on the entity, but I need the above format in a single-valued field, and that is where I have failed.

Can anybody help me get my desired result, or tell me what I am doing wrong in the transformer above?

Thanks
Meghana

--
View this message in context: http://lucene.472066.n3.nabble.com/PlainTextEntityProcessor-and-RegexTransformer-in-DataImport-Handler-tp3608449p3608449.html
Re: PlainTextEntityProcessor and RegexTransformer in DataImport Handler
Hi, does anybody have any idea how I can achieve this? Also, if it is possible to convert a multivalued field to a non-multivalued field, that would work for me too. I have a custom multivalued field, ArrText, with the values shown below:

<arr name="ArrText">
  <str>12 : Hello World!!</str>
  <str>14 : Welcome to Solr.</str>
  <str>15 : Enjoy</str>
</arr>

If we can convert this to my desired result, that would be great.

Thanks in advance.
Meghana

--
View this message in context: http://lucene.472066.n3.nabble.com/PlainTextEntityProcessor-and-RegexTransformer-in-DataImport-Handler-tp3608449p3608726.html
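For the multivalued-to-single-valued conversion asked about above, one possible sketch is to join the values with a separator before storing them. `joinValues` is a hypothetical helper, and the sketch assumes the values arrive as a plain JavaScript array; inside a DIH script the field would more likely be a Java list, so the loop might need `size()` and `get(i)` instead.

```javascript
// Hypothetical helper: collapses the values of a multivalued field
// into one string. Assumes `values` is a plain JavaScript array.
function joinValues(values, separator) {
  var parts = [];
  for (var i = 0; i < values.length; i++) {
    // String() guards against non-string entries.
    parts.push(String(values[i]));
  }
  return parts.join(separator);
}

// Example with the ArrText values from the message above.
var arrText = ["12 : Hello World!!", "14 : Welcome to Solr.", "15 : Enjoy"];
var singleValue = joinValues(arrText, "\n");
```

Here `singleValue` holds the three entries on separate lines in a single string, which is the shape the single-valued field needs.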