RE: Can not index raw binary data stored in Database in BLOB format.

2014-02-25 Thread Chandan khatua
Hi Gora, The column type in DB is BLOB. It only stores binary data. If I do not use TikaEntityProcessor, then the following exception occurs: at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:457) 59163 [Thread-16] ERROR

Re: Can not index raw binary data stored in Database in BLOB format.

2014-02-25 Thread Gora Mohanty
On 25 February 2014 14:54, Chandan khatua chand...@nrifintech.com wrote: Hi Gora, The column type in DB is BLOB. It only stores binary data. If I do not use TikaEntityProcessor, then the following exception occurs: [...] It is difficult to follow what you are doing when you say one thing,

RE: Can not index raw binary data stored in Database in BLOB format.

2014-02-25 Thread Chandan khatua
Okay. Here is my data-config file: <?xml version="1.0" encoding="UTF-8" ?> <dataConfig> <dataSource name="db" driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:thin:@//1.2.3.4:1/d11gr21" user="" password="" /> <dataSource name="dastream" type="FieldStreamDataSource"/> <document> <entity

Re: Can not index raw binary data stored in Database in BLOB format.

2014-02-25 Thread Raymond Wiker
A few things: 1) If your database uses a BLOB, you should not use ClobTransformer; FieldStreamDataSource should be sufficient. 2) A previous message showed that the converted/extracted document was empty (except for an HTML boilerplate wrapper). This was using the configuration I

Re: Can not index raw binary data stored in Database in BLOB format.

2014-02-24 Thread Gora Mohanty
On 24 February 2014 12:51, Chandan khatua chand...@nrifintech.com wrote: Hi, We have raw binary data stored in a database (not Word, Excel, XML, etc. files) in a BLOB. We are trying to index it using TikaEntityProcessor but nothing seems to get indexed. But the same configuration works when

RE: Can not index raw binary data stored in Database in BLOB format.

2014-02-24 Thread Chandan khatua
Hi Gora! Your question was: "What is the type of the column used to store the binary data in Oracle?" The column type in the DB is BLOB. The column can also hold rich text files. Regards, Chandan -----Original Message----- From: Gora Mohanty [mailto:g...@mimirtech.com] Sent: Monday, February 24,

Re: Can not index raw binary data stored in Database in BLOB format.

2014-02-24 Thread Raymond Wiker
I've done something like this; the key was to use a FieldStreamDataSource to read from the BLOB field. Something like <dataSource name="main" ...> <dataSource type="FieldStreamDataSource" name="fieldstream"/> then <entity name="tika" processor="TikaEntityProcessor" dataField="main.BLOB"

RE: Can not index raw binary data stored in Database in BLOB format.

2014-02-24 Thread Chandan khatua
Hi Raymond! I have a data-config.xml like the one below: <?xml version="1.0" encoding="UTF-8" ?> <dataConfig> <dataSource name="db" driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:thin:@//x.x.x.x:x/d11gr21" user="x" password="x"/> <dataSource name="dastream" type="FieldStreamDataSource" /> <document> <entity

Re: Can not index raw binary data stored in Database in BLOB format.

2014-02-24 Thread Raymond Wiker
Try replacing the inner entity with something like <entity name="message" dataSource="dastream" processor="TikaEntityProcessor" dataField="messages.MESSAGE" format="xml"> <field column="text" name="mxMsg"/> </entity> --- this assumes that you get the blob from a
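
For readers following the thread, the suggestion above can be laid out as a complete data-config.xml. This is only a sketch pieced together from the fragments quoted in these messages, not the poster's actual file. The entity names (messages, message), the column MESSAGE, and the field names id and mxMsg come from the thread; the JDBC host, port, service name, credentials, the table name MESSAGES, and the column name ID are placeholders assumed for illustration.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
  <!-- JDBC source for the outer query. HOST, PORT, SERVICE, USER and
       PASSWORD are placeholders, not values from the thread. -->
  <dataSource name="db"
              driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@//HOST:PORT/SERVICE"
              user="USER" password="PASSWORD"/>

  <!-- Streams the bytes of a column from the parent entity's current row. -->
  <dataSource name="dastream" type="FieldStreamDataSource"/>

  <document>
    <!-- Outer entity: one row per document. The table name MESSAGES and
         the column name ID are assumed; MESSAGE is the BLOB column named
         in the thread. -->
    <entity name="messages" dataSource="db"
            query="SELECT ID, MESSAGE FROM MESSAGES">
      <field column="ID" name="id"/>

      <!-- Inner entity: Tika parses the BLOB read through dastream.
           dataField is written as outerEntityName.COLUMN. -->
      <entity name="message" dataSource="dastream"
              processor="TikaEntityProcessor"
              dataField="messages.MESSAGE" format="xml">
        <field column="text" name="mxMsg"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

The dataField value has to match the outer entity's name and the column name exactly as the JDBC driver returns it, which is why a later reply asks to confirm that the BLOB column is really called MESSAGE. Whether Tika can extract any text at all still depends on the bytes being in a format it recognises.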

RE: Can not index raw binary data stored in Database in BLOB format.

2014-02-24 Thread Chandan khatua
I've tried as per your guidance, but no data is being indexed. The output of the Query screen looks like: <doc> <str name="id">2158</str> <arr name="mxMsg"> <str><?xml version="1.0" encoding="UTF-8"?><html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta name="Content-Type" content="application/octet-stream"/> <title/>

Re: Can not index raw binary data stored in Database in BLOB format.

2014-02-24 Thread Raymond Wiker
Try running the query for the outer entity (messages) in an SQL client, and verify that your blob column is called MESSAGE. On Mon, Feb 24, 2014 at 12:22 PM, Chandan khatua chand...@nrifintech.com wrote: I've tried as per your guide. But, no data are indexing. The output of Query screen looks

Re: Can not index raw binary data stored in Database in BLOB format.

2014-02-24 Thread Gora Mohanty
On 24 February 2014 15:34, Chandan khatua chand...@nrifintech.com wrote: Hi Gora ! Your concern was What is the type of the column used to store the binary data in Oracle? The column type is BLOB in DB. The column can also have rich text file. Um, your original message said that it does

RE: Can not index raw binary data stored in Database in BLOB format.

2014-02-24 Thread Chandan khatua
I have verified that the BLOB column is called MESSAGE. In my data-config file, the field column named 'id' is indexed in Solr, but the data (field column name="mxMsg") is not indexed; it comes back as an empty string in quotes. The same configuration works on XML data (stored as BLOB type in the DB), but not on