Question about how to upload XML by using SolrJ Client Java Code

2014-02-12 Thread Eric_Peng
I was just trying to use the SolrJ client to import XML data into a Solr server,
and I read on the SolrJ wiki that SolrJ lets you upload content in XML and
binary format.

I realized there is an XML parser in Solr (we can use a DataImportHandler from
the Dataimport page of a core in the default Solr UI).

So I was wondering how to use the Solr XML parser directly to upload XML from
SolrJ Java code. I could use another open-source XML parser, but I really want
to know if there is a way to call the Solr parser library.

Would you mind sending me a simple code sample if possible? It would be really
appreciated. Thanks in advance.

solr/4.6.1
 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Question-about-how-to-upload-XML-by-using-SolrJ-Client-Java-Code-tp4116901.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Question about how to upload XML by using SolrJ Client Java Code

2014-02-12 Thread Erick Erickson
Hmmm, before going there let's be sure you're trying to do
what you think you are.

Solr does _not_ index arbitrary XML. There is a very
specific format of XML that describes solr documents
that _can_ be indexed. But random XML is not
supported. See the documents in example/exampledocs
for the XML form of Solr docs.

So if you have arbitrary XML, you need to parse it and then
construct Solr documents. One way would be to use
SolrJ: parse the docs using your favorite Java parser,
construct SolrInputDocuments, and then use one of the SolrServer
classes (e.g. CloudSolrServer) to add them to the index.
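
As a rough sketch of the parsing half of that approach, the class below uses
only the JDK DOM parser to turn arbitrary XML into one field map per record.
The class name, the method name and the <books>/<book> input shape are made up
for illustration and are not part of Solr or SolrJ. With SolrJ on the
classpath, each map entry would presumably be copied into a SolrInputDocument
with addField(name, value) and the document passed to SolrServer.add(...).

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not a Solr class: flattens record elements into field maps.
class ArbitraryXmlToFields {

    // Returns one name -> text map per <recordTag> element found in the XML.
    // Only direct child elements are read; a repeated child name overwrites
    // the earlier value, so multi-valued fields would need a different map type.
    static List<Map<String, String>> parse(String xml, String recordTag) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document dom = builder.parse(new InputSource(new StringReader(xml)));
            NodeList records = dom.getElementsByTagName(recordTag);
            List<Map<String, String>> docs = new ArrayList<>();
            for (int i = 0; i < records.getLength(); i++) {
                Map<String, String> fields = new LinkedHashMap<>();
                NodeList children = records.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    if (child.getNodeType() == Node.ELEMENT_NODE) {
                        fields.put(child.getNodeName(), child.getTextContent().trim());
                    }
                }
                docs.add(fields);
            }
            return docs;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse input XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input; real data would come from a file or stream.
        String xml = "<books>"
            + "<book><id>1</id><title>Solr in Action</title></book>"
            + "<book><id>2</id><title>Lucene in Action</title></book>"
            + "</books>";
        for (Map<String, String> fields : parse(xml, "book")) {
            // This is the point where each entry would be added to a SolrInputDocument.
            System.out.println(fields);
        }
    }
}
```

Keeping the parsing step free of SolrJ types makes it easy to test on its own;
the indexing step is then a short loop over the returned maps.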

There really is no Solr XML parser that I know of, Solr just
uses one of the standard XML parsers (e.g. SAX)...

Best,
Erick




Re: Question about how to upload XML by using SolrJ Client Java Code

2014-02-12 Thread Shawn Heisey
When the docs say that SolrJ lets you upload data in XML and binary
format, what they actually mean is that SolrJ will create an update
request that is formatted using XML, not that it will let you send
arbitrary XML data.  It is referring to the specific XML format shown here:

http://wiki.apache.org/solr/UpdateXmlMessages#add.2Freplace_documents
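
To make that format concrete, here is a minimal sketch that writes an
<add><doc><field name="...">...</field></doc></add> message with the JDK's
XMLStreamWriter. The class and method names are invented for the example and
are not SolrJ API; SolrJ normally builds this message for you. The field names
"id" and "title" are assumptions about the schema.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper, not a Solr class: builds a Solr XML add message by hand.
class SolrAddXmlBuilder {

    // Writes one <doc> with one <field> element per map entry, wrapped in <add>.
    static String buildAddMessage(Map<String, String> fields) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("add");
            xml.writeStartElement("doc");
            for (Map.Entry<String, String> field : fields.entrySet()) {
                xml.writeStartElement("field");
                xml.writeAttribute("name", field.getKey());
                // writeCharacters escapes markup characters such as & and <
                xml.writeCharacters(field.getValue());
                xml.writeEndElement(); // field
            }
            xml.writeEndElement(); // doc
            xml.writeEndElement(); // add
            xml.flush();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write Solr XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "1");              // assumed schema field
        fields.put("title", "Solr & XML");  // assumed schema field
        System.out.println(buildAddMessage(fields));
    }
}
```

A string built this way is already in the format Solr's XML update handler
expects, which is the only kind of XML it accepts directly.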

As for an XML parser ... SolrJ's XMLResponseParser is a class that
accepts XML *responses* from Solr and translates them into the Java
response object.  There is also BinaryResponseParser.

The only things that I am aware of in Solr that will deal with XML as
the data source are the XPathEntityProcessor in the dataimport handler
and the ExtractingRequestHandler which uses Apache Tika.  Both of these
are actually contrib modules -- jar files for these features are in the
download, but not built into Solr or SolrJ.

If you are using the extracting request handler, you could probably use
the DirectXmlRequest object, where 'xml' is a String with the xml in it:

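  import org.apache.solr.client.solrj.request.DirectXmlRequest;
  import org.apache.solr.common.params.ModifiableSolrParams;
  import org.apache.solr.common.util.NamedList;

  // 'xml' is a String holding the XML; 'solrServer' is an existing SolrServer.
  DirectXmlRequest req = new DirectXmlRequest("/update/extract", xml);
  ModifiableSolrParams params = new ModifiableSolrParams();
  params.set("someParam", "someValue");
  req.setParams(params);
  NamedList<Object> response = solrServer.request(req);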

I hope that you are right and there actually is an XML parser built into
SolrJ.  We would both learn something.

Thanks,
Shawn



Re: Question about how to upload XML by using SolrJ Client Java Code

2014-02-12 Thread Eric_Peng
Thanks a lot, I learnt a lot from this.





Re: Question about how to upload XML by using SolrJ Client Java Code

2014-02-12 Thread Eric_Peng
Thank you so much, Erick. I will try to write my own XML parser.






Re: Question about how to upload XML by using SolrJ Client Java Code

2014-02-12 Thread Jack Krupansky
There is also an XSLT update handler option to transform raw XML to Solr 
XML on the fly. If anybody here has used it, feel free to chime in.


See:
http://wiki.apache.org/solr/XsltUpdateRequestHandler
and
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-UsingXSLTtoTransformXMLIndexUpdates
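
For anyone who wants to see what such a stylesheet does before configuring it
on the server, here is a small sketch that runs the same kind of transform on
the client with the JDK's javax.xml.transform API. This is not the XSLT update
handler itself, which applies the stylesheet inside Solr; the class name, the
stylesheet and the <books>/<book> input shape are all made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not a Solr class: turns raw XML into Solr XML with XSLT.
class RawXmlToSolrXml {

    // Made-up stylesheet: maps each <book> record to a Solr <doc> element.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='/books'>"
        + "<add><xsl:apply-templates select='book'/></add>"
        + "</xsl:template>"
        + "<xsl:template match='book'>"
        + "<doc>"
        + "<field name='id'><xsl:value-of select='id'/></field>"
        + "<field name='title'><xsl:value-of select='title'/></field>"
        + "</doc>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies STYLESHEET to the raw XML and returns the resulting Solr XML.
    static String transform(String rawXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(rawXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("could not transform input XML", e);
        }
    }

    public static void main(String[] args) {
        String raw = "<books><book><id>1</id><title>Solr in Action</title></book></books>";
        System.out.println(transform(raw));
    }
}
```

Whether to transform on the client or let the handler do it on the server is a
deployment choice; the stylesheet logic is the same either way.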

-- Jack Krupansky
