Okay. Here is my data-config file:

<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
  <dataSource name="db" driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:thin:@//1.2.3.4:1/d11gr21" user="aaaa" password="aaaa" />
  <dataSource name="dastream" type="FieldStreamDataSource"/>
  <document>
    <entity name="messages" pk="X_MSG_PK" query="select * from table1" dataSource="db">
      <field column="X_MSG_PK" name="id" />
      <entity name="message" transformer="ClobTransformer" dataSource="dastream" processor="TikaEntityProcessor" dataField="messages.MESSAGE" format="text">
        <field column="text" name="mxMsg" clob="true"/>
      </entity>
    </entity>
  </document>
</dataConfig>

----------------------------------------------------------------------------
Solr.log file:

INFO - 2014-02-25 17:33:40.023; org.apache.solr.core.SolrCore; [CHESS_CORE] webapp=/solr path=/admin/mbeans params={cat=QUERYHANDLER&_=1393329819994&wt=json} status=0 QTime=1
INFO - 2014-02-25 17:33:40.094; org.apache.solr.core.SolrCore; [CHESS_CORE] webapp=/solr path=/admin/mbeans params={cat=QUERYHANDLER&_=1393329820083&wt=json} status=0 QTime=0
INFO - 2014-02-25 17:33:40.117; org.apache.solr.core.SolrCore; [CHESS_CORE] webapp=/solr path=/dataimport params={indent=true&command=status&_=1393329820089&wt=json} status=0 QTime=16
INFO - 2014-02-25 17:33:40.131; org.apache.solr.core.SolrCore; [CHESS_CORE] webapp=/solr path=/dataimport params={indent=true&command=show-config&_=1393329820084} status=0 QTime=29
INFO - 2014-02-25 17:33:42.026; org.apache.solr.handler.dataimport.DataImporter; Loading DIH Configuration: /dataconfig/data-config.xml
INFO - 2014-02-25 17:33:42.031; org.apache.solr.handler.dataimport.DataImporter; Data Configuration loaded successfully
INFO - 2014-02-25 17:33:42.033; org.apache.solr.core.SolrCore; [CHESS_CORE] webapp=/solr path=/dataimport params={optimize=false&indent=true&clean=true&commit=true&verbose=false&command=full-import&debug=false&wt=json} status=0 QTime=8
INFO - 2014-02-25 17:33:42.035; org.apache.solr.handler.dataimport.DataImporter; Starting Full Import
INFO - 2014-02-25 17:33:42.043; org.apache.solr.core.SolrCore; [CHESS_CORE] webapp=/solr path=/dataimport params={indent=true&command=status&_=1393329822040&wt=json} status=0 QTime=0
INFO - 2014-02-25 17:33:42.064; org.apache.solr.handler.dataimport.SimplePropertiesWriter; Read dataimport.properties
INFO - 2014-02-25 17:33:42.092; org.apache.solr.search.SolrIndexSearcher; Opening Searcher@2a858a73 realtime
INFO - 2014-02-25 17:33:42.093; org.apache.solr.handler.dataimport.JdbcDataSource$1; Creating a connection for entity messages with URL: jdbc:oracle:thin:@//172.16.29.92:1521/d11gr21
INFO - 2014-02-25 17:33:42.113; org.apache.solr.handler.dataimport.JdbcDataSource$1; Time taken for getConnection(): 19
INFO - 2014-02-25 17:33:42.564; org.apache.solr.handler.dataimport.DocBuilder; Import completed successfully
INFO - 2014-02-25 17:33:42.564; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO - 2014-02-25 17:33:42.867; org.apache.solr.core.SolrDeletionPolicy; SolrDeletionPolicy.onCommit: commits: num=2
    commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@C:\solr-4.5.1\example\multicore\CHESS_CORE\data\index lockFactory=org.apache.lucene.store.NativeFSLockFactory@2c6d8073; maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_l,generation=21}
    commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@C:\solr-4.5.1\example\multicore\CHESS_CORE\data\index lockFactory=org.apache.lucene.store.NativeFSLockFactory@2c6d8073; maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_m,generation=22}
INFO - 2014-02-25 17:33:42.868; org.apache.solr.core.SolrDeletionPolicy; newest commit generation = 22
INFO - 2014-02-25 17:33:42.882; org.apache.solr.search.SolrIndexSearcher; Opening Searcher@558ea0cc main
INFO - 2014-02-25 17:33:42.886; org.apache.solr.core.QuerySenderListener; QuerySenderListener sending requests to Searcher@558ea0cc main{StandardDirectoryReader(segments_m:55:nrt _d(4.5.1):C80)}
INFO - 2014-02-25 17:33:42.889; org.apache.solr.core.QuerySenderListener; QuerySenderListener done.
INFO - 2014-02-25 17:33:42.889; org.apache.solr.core.SolrCore; [CHESS_CORE] Registered new searcher Searcher@558ea0cc main{StandardDirectoryReader(segments_m:55:nrt _d(4.5.1):C80)}
INFO - 2014-02-25 17:33:42.893; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
INFO - 2014-02-25 17:33:42.899; org.apache.solr.handler.dataimport.SimplePropertiesWriter; Read dataimport.properties
INFO - 2014-02-25 17:33:42.901; org.apache.solr.handler.dataimport.SimplePropertiesWriter; Wrote last indexed time to dataimport.properties
INFO - 2014-02-25 17:33:42.905; org.apache.solr.handler.dataimport.DocBuilder; Time taken = 0:0:0.839
INFO - 2014-02-25 17:33:42.905; org.apache.solr.update.processor.LogUpdateProcessor; [CHESS_CORE] webapp=/solr path=/dataimport params={optimize=false&indent=true&clean=true&commit=true&verbose=false&command=full-import&debug=false&wt=json} status=0 QTime=8 {deleteByQuery=*:* (-1461012211508969472),add=[2158 (1461012211583418368), 2265 (1461012211591806976), 2225 (1461012211597049856), 2241 (1461012211602292736), 2276 (1461012211607535616), 2277 (1461012211612778496), 2302 (1461012211619069952), 4558 (1461012211624312832), 2144 (1461012211629555712), 2145 (1461012211635847168), ... (80 adds)],commit=} 0 8
INFO - 2014-02-25 17:33:47.623; org.apache.solr.core.SolrCore; [CHESS_CORE] webapp=/solr path=/dataimport params={indent=true&command=status&_=1393329827620&wt=json} status=0 QTime=1

----------------------------------------------------------------------------
Part of the query result screen:

"docs": [
  {
    "id": "2158",
    "mxMsg": [ "" ],
    "_version_": 1461012211583418400
  },
  {
    "id": "2265",
    "mxMsg": [ "" ],
    "_version_": 1461012211591807000
  },

----------------------------------------------------------------------------
As you can see, 'id' is indexed properly, but 'mxMsg' is empty.
----------------------------------------------------------------------------

Please suggest how I can get data into the 'mxMsg' field. The binary data is stored in the DB as BLOB type.

Please note: the same configuration works fine when XML data is stored in the DB as BLOB type; in that case 'mxMsg' displays data.

Please help.

Looking forward to your reply,
Chandan

-----Original Message-----
From: Gora Mohanty [mailto:g...@mimirtech.com]
Sent: Tuesday, February 25, 2014 4:35 PM
To: solr-user@lucene.apache.org
Subject: Re: Can not index raw binary data stored in Database in BLOB format.

On 25 February 2014 14:54, Chandan khatua <chand...@nrifintech.com> wrote:
> Hi Gora,
>
> The column type in DB is BLOB. It only stores binary data.
>
> If I do not use TikaEntityProcessor, then the following exception occurs:
[...]

It is difficult to follow what you are doing when you say one thing, and seem to do another. You say above that you are not using TikaEntityProcessor but your DIH data configuration file shows that you are.

Please start with one configuration, and show us the *exact* files in use, and the error from the Solr logs.

Regards,
Gora