[ 
https://issues.apache.org/jira/browse/SOLR-5981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981186#comment-13981186
 ] 

Aaron LaBella commented on SOLR-5981:
-------------------------------------

Sure ... I did a proof of concept that uses the DataImportHandler (DIH) 
framework to import into MongoDB.  I think the architecture and functionality 
that DIH supports are fantastic (evaluators, transformers, etc.), and the only 
"import" that MongoDB supports (as far as I know) is CSV.

So I took advantage of the Solr code base to do everything that DIH does, 
i.e. connect to a DB and get data, but instead of dumping the results into the 
Solr index, I create MongoDB documents.  My proof of concept supports two 
modes, insert and copy: the former inserts into MongoDB and skips Solr, while 
the latter inserts documents into both.
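
To make the two modes concrete, here is a minimal, self-contained sketch in 
Java.  It is not the actual proof of concept: RowSink, ListSink, ModeWriter 
and Mode are hypothetical names invented for illustration, and the in-memory 
lists only stand in for MongoDB and the Solr index.  No real Solr or MongoDB 
API is used.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ImportModeSketch {

    /** Hypothetical sink for imported rows; not a Solr or MongoDB API. */
    interface RowSink {
        void write(Map<String, Object> row);
    }

    /** In-memory stand-in for a document store or a search index. */
    static final class ListSink implements RowSink {
        final List<Map<String, Object>> docs = new ArrayList<>();

        @Override
        public void write(Map<String, Object> row) {
            // Copy the row so later changes by the caller do not affect stored docs.
            docs.add(new LinkedHashMap<>(row));
        }
    }

    /** INSERT writes to the store only; COPY writes to the store and the index. */
    enum Mode { INSERT, COPY }

    /** Routes each row according to the configured mode. */
    static final class ModeWriter implements RowSink {
        private final Mode mode;
        private final RowSink store;
        private final RowSink index;

        ModeWriter(Mode mode, RowSink store, RowSink index) {
            this.mode = mode;
            this.store = store;
            this.index = index;
        }

        @Override
        public void write(Map<String, Object> row) {
            store.write(row);
            if (mode == Mode.COPY) {
                index.write(row);
            }
        }
    }

    public static void main(String[] args) {
        ListSink store = new ListSink();
        ListSink index = new ListSink();
        RowSink writer = new ModeWriter(Mode.COPY, store, index);

        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("title", "example");
        writer.write(row);

        // prints "1 stored, 1 indexed"
        System.out.println(store.docs.size() + " stored, " + index.docs.size() + " indexed");
    }
}
```

With Mode.INSERT the same call would leave the index list empty, which 
corresponds to the "skips Solr" behaviour described above.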

It turns out someone else had a similar idea, but they rewrote half of the 
Solr DIH framework:
http://code.google.com/p/sql-to-nosql-importer/

My solution only requires a small extension.  I'm happy to share it with the 
Solr community if anyone else wants it.  I think using MongoDB as the document 
store and Solr to index just the fields of the document you want to search on 
has the most potential for serious scalability.

Let me know if you have any additional questions/thoughts/comments.

Thanks.

> Please change method visibility of getSolrWriter in DataImportHandler to 
> public (or at least protected)
> -------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-5981
>                 URL: https://issues.apache.org/jira/browse/SOLR-5981
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>    Affects Versions: 4.0
>         Environment: Linux 3.13.9-200.fc20.x86_64
> Solr 4.6.0
>            Reporter: Aaron LaBella
>            Assignee: Shawn Heisey
>            Priority: Minor
>             Fix For: 4.9, 5.0
>
>         Attachments: SOLR-5981.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> I've been using the org.apache.solr.handler.dataimport.DataImportHandler for 
> a bit and it's an excellent model and architecture.  I'd like to extend the 
> usage of it to plug in my own DIHWriter, but the code doesn't allow for it.  
> Please change ~line 227 in the DataImportHandler class to be:
> public SolrWriter getSolrWriter
> instead of:
> private SolrWriter getSolrWriter
> or, at a minimum, protected, so that I can extend DataImportHandler and 
> override this method.
> Thank you *sincerely* in advance for the quick turn-around on this.  If the 
> change can be made in 4.6.0 and upstream, that'd be ideal.
> Thanks!
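
The reason the visibility matters can be shown with a small, self-contained 
Java sketch.  The classes below are hypothetical stand-ins (BaseHandler, 
CustomHandler, Writer), not the real Solr DataImportHandler or SolrWriter, and 
the method signature is simplified.  The point is only that a private factory 
method cannot be overridden, while a protected one lets a subclass substitute 
its own writer.

```java
public class WriterOverrideSketch {

    /** Hypothetical stand-in for a DIH writer; not the real Solr interface. */
    interface Writer {
        String describe();
    }

    /** Hypothetical stand-in for the framework's handler class. */
    static class BaseHandler {
        // Declared protected rather than private so subclasses can override it.
        protected Writer getWriter() {
            return () -> "default index writer";
        }

        // Framework code obtains its writer through the overridable method.
        final String runImport() {
            return "importing via " + getWriter().describe();
        }
    }

    /** An extension that swaps in a different writer without touching the base class. */
    static class CustomHandler extends BaseHandler {
        @Override
        protected Writer getWriter() {
            return () -> "custom document-store writer";
        }
    }

    public static void main(String[] args) {
        // prints "importing via default index writer"
        System.out.println(new BaseHandler().runImport());
        // prints "importing via custom document-store writer"
        System.out.println(new CustomHandler().runImport());
    }
}
```

If getWriter() were private in BaseHandler, the @Override in CustomHandler 
would fail to compile, and the only alternative would be to copy or rewrite 
the handler.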


