Re: PlainTextEntityProcessor and RegexTransformer in DataImport Handler

2011-12-26 Thread meghana
Thanks Matthew,

It really helped a lot. I am almost done with this.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/PlainTextEntityProcessor-and-RegexTransformer-in-DataImport-Handler-tp3608449p3612674.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: PlainTextEntityProcessor and RegexTransformer in DataImport Handler

2011-12-24 Thread Matthew Parker
I would try something like the following:

<dataConfig>
  <dataSource type="FileDataSource" />
  <script><![CDATA[
    function format(row) {
      var text = row.get('plainText');

      // Do the regex processing with JavaScript's RegExp object
      // and assign the outcome to a variable named results.

      row.put('all_text', results);   // store results in the all_text field
      return row;
    }
  ]]></script>
  <document>
    <entity name="f" processor="FileListEntityProcessor"
            baseDir="[path to text file directory]" fileName=".*txt"
            rootEntity="false" dataSource="null">
      <entity name="x" processor="PlainTextEntityProcessor"
              url="${f.fileAbsolutePath}" rootEntity="true" dataSource="null"
              transformer="script:format"></entity>
    </entity>
  </document>
</dataConfig>
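
The regex step that is left as a comment in format() could be sketched roughly as follows. This is only an illustration in plain JavaScript, assuming the input looks like the sample in the original post (each p tag carries a double-quoted myvar attribute and plain text content). The helper name extractParagraphs is made up for this sketch and is not part of DataImportHandler. Depending on the script engine, row.get('plainText') may hand back a Java string, so the sketch coerces it with String(...) before matching.

```javascript
// Hypothetical helper: builds "<myvar> : <text>" for every <p> element.
// Assumes a double-quoted myvar attribute and no nested markup inside <p>.
function extractParagraphs(html) {
  var pattern = /<p\b[^>]*\bmyvar="([^"]*)"[^>]*>([^<]*)<\/p>/g;
  var lines = [];
  var match;
  // exec() with the g flag walks through the matches one at a time.
  while ((match = pattern.exec(html)) !== null) {
    lines.push(match[1] + ' : ' + match[2]);
  }
  return lines.join('\n');
}

// How the helper might be wired into the transformer function.
function format(row) {
  var text = String(row.get('plainText'));   // coerce a possible Java string
  row.put('all_text', extractParagraphs(text));
  return row;
}
```

Because all matches are joined into one string, all_text can stay a single-valued field.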


On Fri, Dec 23, 2011 at 7:41 AM, meghana <meghana.rav...@amultek.com> wrote:

> Hi,
>
> Does anybody have any idea how I can achieve this?
>
> Also, if it is possible to convert a multivalued field to a
> non-multivalued field, that would also work for me.
>
> I have a custom multivalued field, ArrText, whose value is shown below:
>
> <arr name="ArrText">
>   <str>12 : Hello World!!</str>
>   <str>14 : Welcome to Solr.</str>
>   <str>15 : Enjoy</str>
> </arr>
>
> If we can convert this into my desired result, that would be great.
> Thanks in advance.
> Meghana





PlainTextEntityProcessor and RegexTransformer in DataImport Handler

2011-12-23 Thread meghana
Hi all, 

I need to import data from a text file that contains HTML text and apply
some formatting to it. I want all the text within the p tags, and I want each
piece of text to be preceded by one attribute value of its p tag in my
output, as shown below.

Original Text
-------------
<div><p myvar="12" myvar1="xyz">Hello World!!</p><p myvar="14"
myvar1="abc">Welcome to Solr.</p><p myvar="15" myvar1="def">Enjoy</p></div>


Needed Text After Formatting
----------------------------
12 : Hello World!!
14 : Welcome to Solr.
15 : Enjoy

I applied a combination of PlainTextEntityProcessor with RegexTransformer
and TemplateTransformer for this, as shown below, but I receive a
ConfigurationError when I set it up.

<entity name="xx" onError="continue" processor="PlainTextEntityProcessor"
        transformer="TemplateTransformer,RegexTransformer" url="${URL.MyTxtFile}"
        dataSource="MDataSource">
   <field column="plainText" name="FullText" />
   <field column="FullText"
          template="${xx.FullText}"
          regex='&lt;p (?:\s+[^>]+)? myvar="([^"]*)" (?:\s+[^>]+)?>([^<]*)</p>'
          replaceWith="$2 : $4"/>
</entity>

I would like to add that I am able to do this using TemplateTransformer and a
multivalued field by setting foreach on the entity, but I need the above
format in a single-valued field, which I have not been able to achieve.

Can anybody help me get my desired result, or tell me what I am doing wrong
in the above transformer?
Thanks,
Meghana

--
View this message in context: 
http://lucene.472066.n3.nabble.com/PlainTextEntityProcessor-and-RegexTransformer-in-DataImport-Handler-tp3608449p3608449.html


Re: PlainTextEntityProcessor and RegexTransformer in DataImport Handler

2011-12-23 Thread meghana
Hi,

Does anybody have any idea how I can achieve this?

Also, if it is possible to convert a multivalued field to a non-multivalued
field, that would also work for me.

I have a custom multivalued field, ArrText, whose value is shown below:

<arr name="ArrText">
  <str>12 : Hello World!!</str>
  <str>14 : Welcome to Solr.</str>
  <str>15 : Enjoy</str>
</arr>

If we can convert this into my desired result, that would be great.
Thanks in advance.
Meghana
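
For reference, collapsing the values of a multivalued field into one string could be sketched as follows. This is a minimal plain-JavaScript illustration; the function name joinValues and the newline separator are assumptions made for the example. Inside a DataImportHandler script transformer the value of ArrText would likely arrive as a Java list, not a JavaScript array, so the loop would have to be adapted to that type.

```javascript
// Hypothetical helper: joins an array of values into one string.
// The separator defaults to a newline when none is given.
function joinValues(values, separator) {
  var sep = (separator === undefined) ? '\n' : separator;
  var parts = [];
  for (var i = 0; i < values.length; i++) {
    parts.push(String(values[i]));
  }
  return parts.join(sep);
}

// The three ArrText entries from the message above.
var arrText = ['12 : Hello World!!', '14 : Welcome to Solr.', '15 : Enjoy'];
var singleValue = joinValues(arrText);
// singleValue now holds the three entries in one string, one per line.
```

The result can then be stored in an ordinary single-valued field.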

--
View this message in context: 
http://lucene.472066.n3.nabble.com/PlainTextEntityProcessor-and-RegexTransformer-in-DataImport-Handler-tp3608449p3608726.html