alex, thank you for the link.
i enabled the trace for 'org.apache.solr.handler.dataimport' and it
seems as if the database is only called once:
<record>
<date>2013-03-21T09:40:43</date>
<millis>1363855243889</millis>
<sequence>50</sequence>
<logger>org.apache.solr.handler.dataimport.JdbcDataSource</logger>
<level>FINE</level>
<class>org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator</class>
<method>&lt;init&gt;</method>
<thread>11</thread>
<message>Executing SQL: select * from doc_properties where
DOCID='0u3xouyscdhye61o'</message>
</record>
therefore i assume the output shown in the dataimporthandler UI is
incorrect. i could double-check with the database logs.
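one way to double-check from the solr side would be to count the 'Executing SQL' trace messages per statement. below is a minimal python sketch, assuming the trace is written to a file in the java.util.logging XML format shown above; the function name count_sql_executions is made up for illustration and is not part of solr:

```python
import re
import sys
from collections import Counter
from html import unescape

# Matches the JdbcDataSource trace message; DOTALL because the SQL may wrap across lines.
SQL_MESSAGE = re.compile(r"<message>\s*Executing SQL:\s*(.*?)</message>", re.DOTALL)


def count_sql_executions(log_text):
    """Return a Counter mapping each normalized SQL statement to how often it was logged."""
    counts = Counter()
    for match in SQL_MESSAGE.finditer(log_text):
        # Undo XML escaping and collapse line wraps so identical statements compare equal.
        statement = " ".join(unescape(match.group(1)).split())
        counts[statement] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: pass the path of the trace log as the first argument.
    with open(sys.argv[1], encoding="utf-8") as handle:
        for statement, n in count_sql_executions(handle.read()).most_common():
            print(n, statement)
```

if every doc_properties statement shows a count of 1, that would support the assumption that only the UI output is duplicated and not the actual database calls.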
cheerio,
patrick
On 20.03.2013 12:07, Alexandre Rafalovitch wrote:
There was something like this on Stack Overflow:
http://stackoverflow.com/questions/15164166/solr-filelistentityprocessor-is-executing-sub-entities-multiple-times
Upgrading Solr helped partially, but the conclusion was not fully
satisfactory.
Regards,
Alex.
On Wed, Mar 20, 2013 at 6:48 AM, patrick <preic...@hotmail.com> wrote:
hi,
the dataimport config file i'm using with Solr 3.6.2 uses a nested select
statement. the first query retrieves the documents, while the nested one
retrieves the corresponding properties.
<dataConfig>
<!--Data source to connect to database-->
<dataSource name="db" driver="oracle.jdbc.driver.OracleDriver"
url="jdbc:oracle:thin:@alexis:1521:FMSPRF01" user="?????"
password="?????"/>
<document name="doc">
<entity name="item" query="select DOCID from documents">
<field column="DOCID" name="index_id" />
<entity name="attributes"
query="select * from doc_properties where DOCID='${item.DOCID}'">
<!-- do something -->
</entity>
</entity>
</document>
</dataConfig>
when running the dataimporthandler with the verbose/debug flag turned on,
the output lists more than one query for 'entity:attributes', and the list
grows with each 'entity:item':
....
<arr name="documents"/>
<lst name="verbose-output">
<lst name="entity:item">
<lst name="document#1">
<str name="query">select DOCID from documents</str>
<str name="time-taken">0:0:0.50</str>
<str>----------- row #1-------------</str>
<str name="DOCID">000emnslnbh88hdd</str>
<str>---------------------------------------------</str>
<lst name="entity:attributes">
<str name="query">select * from doc_properties where
DOCID='000emnslnbh88hdd'</str>
<str name="query">select * from doc_properties where
DOCID='000emnslnbh88hdd'</str>
<str name="time-taken">0:0:0.37</str>
<str name="time-taken">0:0:0.37</str>
<str>----------- row #1-------------</str>
<str name="VALUE">I</str>
<str name="PROPERTY_KEY">message_direction</str>
<str>---------------------------------------------</str>
<str>----------- row #2-------------</str>
<str name="VALUE">heb@test</str>
<str name="PROPERTY_KEY">message_event_source</str>
....
<lst name="document#2">
<str>----------- row #1-------------</str>
<str name="DOCID">000hsjunnbh7weq8</str>
<str>---------------------------------------------</str>
<lst name="entity:attributes">
<str name="query">select * from doc_properties where
DOCID='000hsjunnbh7weq8'</str>
<str name="query">select * from doc_properties where
DOCID='000hsjunnbh7weq8'</str>
<str name="query">select * from doc_properties where
DOCID='000hsjunnbh7weq8'</str>
<str name="query">select * from doc_properties where
DOCID='000hsjunnbh7weq8'</str>
<str name="time-taken">0:0:0.1</str>
<str name="time-taken">0:0:0.1</str>
<str name="time-taken">0:0:0.1</str>
<str name="time-taken">0:0:0.1</str>
<str>----------- row #1-------------</str>
<str name="VALUE">I</str>
<str name="PROPERTY_KEY">message_direction</str>
<str>---------------------------------------------</str>
<str>----------- row #2-------------</str>
<str name="VALUE">heb@test</str>
<str name="PROPERTY_KEY">message_event_source</str>
...
i was wondering if there's something wrong with my configuration - thank
you for clarifying,
patrick