[
https://issues.apache.org/jira/browse/SOLR-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738424#comment-13738424
]
James Dyer commented on SOLR-4799:
----------------------------------
Mikhail, this seems like a great feature, but I haven't looked at it. As I
said, I do not think it wise to add features that won't plug neatly into the
current DIH infrastructure until we improve the code. Really, I would love to
chop out features (debug mode, delta updates, streaming from a POST request,
etc.) and make it work independently of Solr before we build more into it.
But I've been busy with other things and haven't had much time.
By the way, have you any experience with Apache Flume? In your opinion, could
it become DIH's successor? A Solr Sink was added earlier in the year that will
index disparate data. I haven't looked much at it, but my first impression is
that it is a big, complicated tool whereas DIH is smaller and simpler, and the
two would have different use cases. Also, I'm not sure it has any RDBMS
support yet.
> SQLEntityProcessor for zipper join
> ----------------------------------
>
> Key: SOLR-4799
> URL: https://issues.apache.org/jira/browse/SOLR-4799
> Project: Solr
> Issue Type: New Feature
> Components: contrib - DataImportHandler
> Reporter: Mikhail Khludnev
> Priority: Minor
> Labels: dih
> Attachments: SOLR-4799.patch
>
>
> DIH is mostly considered a playground tool, and real deployments end up
> using SolrJ. I want to contribute a few improvements targeting DIH
> performance. This one provides a performant approach for joining SQL
> entities with a minimal memory footprint, in contrast to
> http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityProcessor
> The idea is:
> * the parent table is explicitly ordered by its PK in SQL
> * the children table is explicitly ordered by the parent_id FK in SQL
> * the children entity processor joins the ordered result sets with a
>   ‘zipper’ algorithm.
> Do you think it’s worth contributing to DIH?
> cc: [~goksron] [~jdyer]
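
For readers unfamiliar with the 'zipper' join described in the quoted issue,
here is a minimal sketch of the idea in Python. It is not the code from
SOLR-4799.patch. The function name zipper_join, the parent key column "id",
and the sample rows are assumptions made for illustration; only parent_id
comes from the description. It assumes both row streams arrive already sorted
ascending by their join keys, as the ORDER BY clauses would guarantee.

```python
from typing import Any, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, Any]


def zipper_join(
    parents: Iterable[Row],
    children: Iterable[Row],
    parent_key: str = "id",          # assumed PK column name
    child_key: str = "parent_id",    # FK column named in the issue
) -> Iterator[Tuple[Row, List[Row]]]:
    """Merge-join two row streams that are sorted ascending by join key."""
    child_iter = iter(children)
    child = next(child_iter, None)
    for parent in parents:
        pk = parent[parent_key]
        # Skip children whose key sorts before this parent (orphans).
        while child is not None and child[child_key] < pk:
            child = next(child_iter, None)
        # Collect the run of children that belong to this parent.
        matched: List[Row] = []
        while child is not None and child[child_key] == pk:
            matched.append(child)
            child = next(child_iter, None)
        yield parent, matched


# Hypothetical sample data standing in for two ordered SQL result sets.
parents = [{"id": 1}, {"id": 2}, {"id": 3}]
children = [
    {"parent_id": 1, "v": "a"},
    {"parent_id": 1, "v": "b"},
    {"parent_id": 3, "v": "c"},
]

for parent, kids in zipper_join(parents, children):
    print(parent["id"], [k["v"] for k in kids])
# prints:
# 1 ['a', 'b']
# 2 []
# 3 ['c']
```

Each stream is read once, front to back, and only the children of the current
parent are held in memory at any time. That is the contrast with a cached
lookup, which keeps the whole child table in memory.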