Hello,

We are currently working on this kind of repository connector for a customer. We plan to contribute the code to the MCF project if the customer legally allows us to. We should know by the end of this month or the beginning of next month.

For this to work, all fields of the Solr index the connector reads from (the target Solr) must be stored; this condition is mandatory. You can take a look at the Solr entity processor of the Data Import Handler component: https://lucene.apache.org/solr/guide/8_0/uploading-structured-data-store-data-with-the-data-import-handler.html#entity-processors. It inspired our development of the connector.

Best regards,
Olivier

> On 5 Aug 2019, at 16:38, Furkan KAMACI <[email protected]> wrote:
>
> Hi Dileepa,
>
> Writing a custom repository connector can let you achieve your goal. It
> would read from Solr and write directly to an output connector.
>
> You should check your requirements, i.e. which data sources you will
> connect. In your case, MCF may spare you a lot of integration pain compared
> to many other ETL tools.
>
> On the other hand, if you want federated search, you could search across
> distributed indexes. Otherwise, it is an indexing architecture with
> heterogeneous sources. You can federate your search query into Solr without
> ingesting its data anywhere else. By the way, MCF lets you enforce
> document-level security; in such a case you would have to handle it
> manually.
>
> Kind Regards,
> Furkan KAMACI
>
> On Mon, 5 Aug 2019 at 17:11, Dileepa Jayakody <[email protected]> wrote:
> Hi Karl and all,
>
> In my use case, one of the data sources is an already populated Solr index,
> which is an e-commerce website data index (customers, products & services).
> Apart from the Solr index, I need to ingest several other heterogeneous
> data sources, such as PostgreSQL databases, CRM data etc., into the
> federated search index (the output index will be either Solr or
> Elasticsearch. We haven't finalized the output index yet, but I know that
> both of these are supported in MCF as output connectors).
>
> @Karl based on your comments, I would appreciate your opinion on the
> ingestion flow below.
> Solr repository/data source -> Solr schema transformations ->
> Solr/Elasticsearch search index
>
> For such a scenario, do you think MCF is not the ideal option as the
> ETL/ingestion tool? Should I go for a lower-level ETL tool such as Apache
> NiFi?
> Or will writing an MCF Solr repository connector be useful to achieve this?
> WDYT?
>
> Thanks a lot.
> Regards,
> Dileepa
>
> On Mon, Aug 5, 2019 at 3:40 PM Karl Wright <[email protected]> wrote:
> If you are trying to extract data from a Solr index, I know of no way to do
> that.
> Karl
>
> On Mon, Aug 5, 2019 at 9:08 AM Dileepa Jayakody <[email protected]> wrote:
> Hi All,
>
> Thanks for your replies.
> I'm looking for a repository connector. I've used the Solr output connector
> before. But now what I need is to connect to a Solr index as a repository
> and retrieve the documents from there. So I need a Solr repository
> connector.
>
> @Karl
> I will look at the Solr connector, but this is an output connector, isn't
> it? Can I use it as a repository connector to retrieve docs?
>
> Thanks,
> Dileepa
>
> On Mon, Aug 5, 2019 at 12:45 PM Cihad Guzel <[email protected]> wrote:
> Hi Dileepa,
>
> You can check the list of all MCF connectors at
> https://manifoldcf.apache.org/release/release-2.13/en_US/included-connectors.html
>
> MCF has a Solr Output Connector. It is not a repository connector. If you
> want to use Solr as a repository, you should write a new repository
> connector.
>
> Regards,
> Cihad Guzel
>
> On Mon, 5 Aug 2019 at 13:18, Dileepa Jayakody <[email protected]> wrote:
> Hi All,
>
> I'm working on a project which needs to implement a federated search
> solution with heterogeneous data repositories. One repository is a Solr
> index. I would like to use ManifoldCF as the data ingestion engine in this
> project, as I have worked with MCF before.
>
> Does ManifoldCF have a Solr repository connector which I can use here? Or
> will I need to implement a new repository connector for Solr?
> Any guidance here is much appreciated.
>
> Thanks,
> Dileepa
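
A minimal Python sketch of the two points raised in Olivier's reply at the top of the thread: checking that every field is stored, and reading documents back out of a Solr index page by page. It is not the connector Olivier describes and it is not MCF code. It assumes the response shape of the Solr Schema API field listing (entries with a `stored` flag, as returned when `showDefaults=true` is requested), a uniqueKey field named `id`, and cursor-based paging with `cursorMark`. The function names and the base URL in the usage note are hypothetical.

```python
import json
from typing import Any, Callable, Dict, Iterator, List
from urllib.parse import urlencode
from urllib.request import urlopen

# A fetch function takes Solr query parameters and returns the parsed JSON reply.
Fetch = Callable[[Dict[str, Any]], Dict[str, Any]]


def non_stored_fields(schema_fields: Dict[str, Any]) -> List[str]:
    """Return the names of fields that are not flagged as stored.

    `schema_fields` is assumed to be the parsed JSON of
    GET <core>/schema/fields?showDefaults=true. A field without an explicit
    `stored` flag is reported too, to stay on the conservative side.
    """
    return [
        field["name"]
        for field in schema_fields.get("fields", [])
        if not field.get("stored", False)
    ]


def http_fetch(base_url: str) -> Fetch:
    """Build a fetch function that queries <base_url>/select over HTTP."""

    def fetch(params: Dict[str, Any]) -> Dict[str, Any]:
        with urlopen(f"{base_url}/select?{urlencode(params)}") as resp:
            return json.load(resp)

    return fetch


def iter_solr_docs(
    fetch: Fetch, rows: int = 100, unique_key: str = "id"
) -> Iterator[Dict[str, Any]]:
    """Yield every document of the index using cursorMark paging.

    Cursor paging needs a sort that includes the uniqueKey field; the name
    `id` is an assumption about the schema.
    """
    cursor = "*"  # "*" asks Solr to start a new cursor
    while True:
        page = fetch(
            {
                "q": "*:*",
                "rows": rows,
                "sort": f"{unique_key} asc",
                "cursorMark": cursor,
                "wt": "json",
            }
        )
        yield from page["response"]["docs"]
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:  # an unchanged cursor means no more results
            return
        cursor = next_cursor
```

For example, `iter_solr_docs(http_fetch("http://localhost:8983/solr/products"))` would walk a hypothetical `products` core. Only stored values come back in the returned documents, so a field reported by `non_stored_fields` would be missing from them; that is why the stored condition matters for this kind of connector.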
