[ 
https://issues.apache.org/jira/browse/SOLR-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742992#comment-13742992
 ] 

Mikhail Khludnev commented on SOLR-4799:
----------------------------------------

James,
I don't really understand. I wanted to add a tiny plugin to DIH, but you suggested:
bq. I mean, make zipperjoin an option for any entity processor as opposed to 
its own new variant on SqlE.P.
and then, after I went this way despite heavy doubts, you wrote:
bq. As I said, I do not feel it wise to add features that won't neatly plug-in 
the current DIH infrastructure until we improve the code. 
Anyway, I absolutely share your concerns: DIH is a great idea, but its engine 
is worth revamping. I have no experience with Flume, but I regard it as a 
kind of transport. I also want to look at Pentaho Kettle (an old-school ETL 
tool) and Cloudera Morphlines.
                
> SQLEntityProcessor for zipper join
> ----------------------------------
>
>                 Key: SOLR-4799
>                 URL: https://issues.apache.org/jira/browse/SOLR-4799
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - DataImportHandler
>            Reporter: Mikhail Khludnev
>            Priority: Minor
>              Labels: dih
>         Attachments: SOLR-4799.patch
>
>
> DIH is mostly considered a playground tool, and real usages end up with 
> SolrJ. I want to contribute a few improvements targeting DIH performance.
> This one provides a performant approach for joining SQL entities using very 
> little memory, in contrast to 
> http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityProcessor  
> The idea is:
> * the parent table is explicitly ordered by its PK in SQL
> * the child table is explicitly ordered by its parent_id FK in SQL
> * the child entity processor joins the ordered result sets with a ‘zipper’ algorithm.
> Do you think it’s worth contributing it to DIH?
> cc: [~goksron] [~jdyer]
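
The zipper idea in the description above can be illustrated with a minimal sketch. This is not the code from SOLR-4799.patch and it uses no DIH classes: the class name ZipperJoinSketch, the child() helper, and the use of integer keys and string values are all assumptions made for illustration. It only shows how two inputs that are already sorted by the join key can be merged in a single pass, holding just one parent's children in memory at a time.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical illustration only; none of these names come from DIH or the patch.
class ZipperJoinSketch {

    // Hypothetical helper: models a child row as (parent_id, value).
    static Map.Entry<Integer, String> child(int parentId, String value) {
        return new AbstractMap.SimpleEntry<>(parentId, value);
    }

    // Assumes parentIds is sorted ascending by PK and children is sorted
    // ascending by parent_id, as the two ORDER BY clauses would guarantee.
    static void zipperJoin(Iterator<Integer> parentIds,
                           Iterator<Map.Entry<Integer, String>> children,
                           BiConsumer<Integer, List<String>> emit) {
        // The single child row read ahead of the current parent.
        Map.Entry<Integer, String> pending = children.hasNext() ? children.next() : null;
        while (parentIds.hasNext()) {
            int parentId = parentIds.next();
            // Skip children whose parent_id is smaller than the current parent (orphans).
            while (pending != null && pending.getKey() < parentId) {
                pending = children.hasNext() ? children.next() : null;
            }
            // Collect the children that belong to the current parent.
            List<String> matches = new ArrayList<>();
            while (pending != null && pending.getKey() == parentId) {
                matches.add(pending.getValue());
                pending = children.hasNext() ? children.next() : null;
            }
            // Hand the joined row downstream; nothing else is retained.
            emit.accept(parentId, matches);
        }
    }

    public static void main(String[] args) {
        zipperJoin(
            Arrays.asList(1, 2, 3).iterator(),
            Arrays.asList(child(1, "a"), child(1, "b"), child(3, "c")).iterator(),
            (id, rows) -> System.out.println(id + " -> " + rows));
    }
}
```

Running main prints "1 -> [a, b]", "2 -> []" and "3 -> [c]". Each input is consumed exactly once and never cached, which is the contrast with a cache-based join that loads the whole child table into memory first.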
