[
https://issues.apache.org/jira/browse/SOLR-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981186#comment-13981186
]
Aaron LaBella commented on SOLR-5981:
-------------------------------------
Sure ... I did a proof of concept that uses the DataImportHandler framework to
import into MongoDB. I think the architecture and functionality that DIH
supports are fantastic (evaluators, transformers, etc.), and the only
"import" that MongoDB supports (as far as I know) is CSV.
So, I took advantage of the Solr code base here to do everything that the DIH
does, i.e., connect to a DB and get data, but instead of dumping the results
into the Solr index, I create MongoDB documents. My proof of concept supports
two modes, insert and copy: the former inserts into MongoDB only and skips
Solr, while the latter inserts documents into both.
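To make the two modes concrete, here is a minimal, self-contained sketch. It is
not the actual proof-of-concept code: DocumentSink, WriteMode, DualModeWriter
and InMemorySink are hypothetical names made up for illustration, and the sinks
are in-memory stand-ins rather than real Solr or MongoDB clients.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction over a destination (MongoDB collection, Solr index, ...).
// This is NOT a Solr or MongoDB driver type.
interface DocumentSink {
    void write(Map<String, Object> doc);
}

// INSERT: write to the document store only. COPY: write to the store and the index.
enum WriteMode { INSERT, COPY }

// Hypothetical writer that routes each imported row according to the mode.
final class DualModeWriter {
    private final WriteMode mode;
    private final DocumentSink mongoSink;
    private final DocumentSink solrSink;

    DualModeWriter(WriteMode mode, DocumentSink mongoSink, DocumentSink solrSink) {
        this.mode = mode;
        this.mongoSink = mongoSink;
        this.solrSink = solrSink;
    }

    void upload(Map<String, Object> doc) {
        mongoSink.write(doc);          // both modes store the document
        if (mode == WriteMode.COPY) {
            solrSink.write(doc);       // only copy mode also indexes it
        }
    }
}

// In-memory stand-in for a real backend, so the sketch runs without any server.
final class InMemorySink implements DocumentSink {
    final List<Map<String, Object>> docs = new ArrayList<>();

    @Override
    public void write(Map<String, Object> doc) {
        docs.add(new LinkedHashMap<>(doc)); // defensive copy of the row
    }
}
```

In a real setup the two sinks would wrap the MongoDB driver and the existing
Solr writer; the routing logic is the only part this sketch tries to show.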
Turns out someone else had a similar idea, but they rewrote half the Solr DIH
framework:
http://code.google.com/p/sql-to-nosql-importer/
My solution only requires a small extension... I'm happy to share it with the
Solr community if anyone else wants it. I think using MongoDB as the document
store and Solr to index just the fields of the document you want to search on
has the most potential for serious scalability.
Let me know if you have any additional questions/thoughts/comments.
Thanks.
> Please change method visibility of getSolrWriter in DataImportHandler to
> public (or at least protected)
> -------------------------------------------------------------------------------------------------------
>
> Key: SOLR-5981
> URL: https://issues.apache.org/jira/browse/SOLR-5981
> Project: Solr
> Issue Type: Improvement
> Components: contrib - DataImportHandler
> Affects Versions: 4.0
> Environment: Linux 3.13.9-200.fc20.x86_64
> Solr 4.6.0
> Reporter: Aaron LaBella
> Assignee: Shawn Heisey
> Priority: Minor
> Fix For: 4.9, 5.0
>
> Attachments: SOLR-5981.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> I've been using the org.apache.solr.handler.dataimport.DataImportHandler for
> a bit and it's an excellent model and architecture. I'd like to extend the
> usage of it to plug in my own DIHWriter, but the code doesn't allow for it.
> Please change ~line 227 in the DataImportHandler class to be:
> public SolrWriter getSolrWriter
> instead of:
> private SolrWriter getSolrWriter
> or, at a minimum, protected, so that I can extend DataImportHandler and
> override this method.
> Thank you *sincerely* in advance for the quick turn-around on this. If the
> change can be made in 4.6.0 and upstream, that'd be ideal.
> Thanks!