Hi Gora,
The column type in DB is BLOB. It only stores binary data.
If I do not use TikaEntityProcessor, then the following exception occurs:
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:457)
59163 [Thread-16] ERROR
On 25 February 2014 14:54, Chandan khatua chand...@nrifintech.com wrote:
Hi Gora,
The column type in DB is BLOB. It only stores binary data.
If I do not use TikaEntityProcessor, then the following exception occurs:
[...]
It is difficult to follow what you are doing when you say one thing,
Okay.
Here is my data-config file:
<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
<dataSource name="db" driver="oracle.jdbc.driver.OracleDriver"
url="jdbc:oracle:thin:@//1.2.3.4:1/d11gr21" user="" password="" />
<dataSource name="dastream" type="FieldStreamDataSource"/>
<document>
<entity
A few things:
1) If your database uses a BLOB, you should not use ClobTransformer;
FieldStreamDataSource should be sufficient.
2) In a previous message, it showed that the converted/extracted document
was empty (except for an HTML boilerplate wrapper). This was using the
configuration I
On 24 February 2014 12:51, Chandan khatua chand...@nrifintech.com wrote:
Hi,
We have raw binary data stored in the database (not Word, Excel, XML, etc.
files) in a BLOB.
We are trying to index using TikaEntityProcessor but nothing seems to get
indexed.
But the same configuration works when
Hi Gora !
Your question was: "What is the type of the column used to store the binary
data in Oracle?"
The column type in the DB is BLOB. The column can also hold rich text files.
Regards,
Chandan
-----Original Message-----
From: Gora Mohanty [mailto:g...@mimirtech.com]
Sent: Monday, February 24,
I've done something like this; the key was to use a FieldStreamDataSource
to read from the BLOB field.
Something like
<dataSource name="main" ... />
<dataSource type="FieldStreamDataSource" name="fieldstream"/>
then
<entity name="tika" processor="TikaEntityProcessor"
dataField="main.BLOB"
Hi Raymond !
My data-config.xml looks like the one below:
<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
<dataSource name="db" driver="oracle.jdbc.driver.OracleDriver"
url="jdbc:oracle:thin:@//x.x.x.x:x/d11gr21" user="x" password="x"/>
<dataSource name="dastream" type="FieldStreamDataSource" />
<document>
<entity
Try replacing the inner entity with something like
<entity name="message"
  dataSource="dastream"
  processor="TikaEntityProcessor"
  dataField="messages.MESSAGE"
  format="xml">
  <field column="text" name="mxMsg"/>
</entity>
--- this assumes that you get the blob from a
I've tried it as per your guide, but no data is being indexed.
The output of the Query screen looks like:
<doc>
<str name="id">2158</str>
<arr name="mxMsg">
<str><?xml version="1.0" encoding="UTF-8"?><html
xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="Content-Type" content="application/octet-stream"/>
<title/>
Try running the query for the outer entity (messages) in an SQL client,
and verify that your blob column is called MESSAGE.
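For reference, here is a minimal sketch of the complete data-config.xml that the suggestions in this thread add up to. Only the entity name messages, the MESSAGE column, the dastream data source and the mxMsg field come from the thread. The table name MESSAGES, the ID column and the SELECT statement are assumptions for illustration, and HOST, PORT, SERVICE, USER and PASSWORD are placeholders to be replaced with real values.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
  <!-- JDBC source for the outer entity; connection values are placeholders -->
  <dataSource name="db" driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@//HOST:PORT/SERVICE"
              user="USER" password="PASSWORD"/>
  <!-- Streams the bytes of a column from the parent entity's current row -->
  <dataSource name="dastream" type="FieldStreamDataSource"/>
  <document>
    <!-- Outer entity: table name, ID column and query are assumed, not from the thread -->
    <entity name="messages" dataSource="db"
            query="SELECT ID, MESSAGE FROM MESSAGES">
      <field column="ID" name="id"/>
      <!-- Inner entity: Tika parses the BLOB exposed as messages.MESSAGE -->
      <entity name="message"
              dataSource="dastream"
              processor="TikaEntityProcessor"
              dataField="messages.MESSAGE"
              format="xml">
        <field column="text" name="mxMsg"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

The dataField value is the outer entity name followed by the column name, so it has to match the column exactly as the JDBC driver returns it. That is why it is worth checking the name in an SQL client first.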
On Mon, Feb 24, 2014 at 12:22 PM, Chandan khatua chand...@nrifintech.com wrote:
I've tried it as per your guide, but no data is being indexed.
The output of the Query screen looks
On 24 February 2014 15:34, Chandan khatua chand...@nrifintech.com wrote:
Hi Gora !
Your question was: "What is the type of the column used to store the binary
data in Oracle?"
The column type in the DB is BLOB. The column can also hold rich text files.
Um, your original message said that it does
I have verified that the blob column is called MESSAGE.
In my data-config file, the field column named 'id' is indexed in Solr, but
the data (field column name="mxMsg") is not indexed. It comes back empty,
within quotes.
The same configuration works on XML data (stored as BLOB type in the DB), but
not on