Hello,
I am in an odd position. The application server I use has built-in integration with Solr. Unfortunately, its native capabilities are fairly limited: it only supports a predefined set of fields that can be indexed. As a result, I've been kludging how I work with Solr, doing things like putting what I'd like to be multiple, separate fields into a single Solr field. For example, I may put a customer id and name into a single field called 'custom1'. Ideally, I'd like this information to be returned in separate fields. Even better would be for them to be indexed as separate fields, but I can live without the latter. Currently, I build a JSON representation of this information, which makes it easy to deal with when I extract the results, but it all feels wrong.

I do have complete control over the actual Solr installation (just not the indexing call to Solr), so I was hoping there may be a way to configure Solr to take my single field and split it into a separate field for each key in my JSON representation. I don't see anything native to Solr that would do this for me, but a few features sounded similar, and I was hoping to get some opinions on how I might move forward.

Poly fields, such as the spatial location, might help? Can I build my own poly field that splits the main field into subfields? Do poly fields let me return the subfields? I don't quite have my head around poly fields yet.

Another option, although I suspect it won't be considered a good approach: what about extending the copyField functionality of schema.xml to support my needs? It seems not entirely unreasonable for copyField to provide a way to extract only a portion of the source field's contents and place it in the destination field, no?
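To make the current workaround concrete, here is a minimal Python sketch of the pack/unpack step I described above. The key names (customer_id, customer_name) and the sample document are made up for illustration; only the 'custom1' field name comes from my setup, and the real code lives in my application, not in Solr.

```python
import json

def pack_custom1(customer_id, name):
    """Pack several logical fields into the single string that gets indexed as 'custom1'."""
    # Hypothetical key names, chosen only for this illustration.
    return json.dumps({"customer_id": customer_id, "customer_name": name}, sort_keys=True)

def unpack_doc(doc, packed_field="custom1"):
    """Return a copy of a result document with the packed field split into separate keys."""
    flat = {k: v for k, v in doc.items() if k != packed_field}
    packed = doc.get(packed_field)
    if packed:
        # Each key of the JSON object becomes its own top-level field.
        flat.update(json.loads(packed))
    return flat

# A made-up search result as it would come back from Solr today.
doc = {"id": "doc-1", "custom1": pack_custom1(42, "Acme Corp")}
flat_doc = unpack_doc(doc)
print(flat_doc)
```

What I'm after is for Solr itself to do the equivalent of unpack_doc, so that clients never see the packed field.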
I'm sure people more familiar with Solr's architecture could explain why splitting a field like this isn't really an appropriate thing for Solr to handle (just because it could doesn't mean it should)...

The other, and probably best, option would be to use Solr directly, bypassing the native integration of my application server, which we've already done for most cases. I'd love to go this route, but I'm having a hard time figuring out how to "easily" accomplish the same functionality provided by my app server integration. Perhaps someone on the list could help me with this path forward?

Here is what I'm trying to accomplish: I'm indexing documents (text, PDF, HTML...), but I need to include fields in my search results that are only available from a db query. I know how to have Solr index the results of a db query, but I'm having trouble getting it to index the documents associated with each record of that query (the full path/filename is one of the fields of that query).

I started to try to use the DataImportHandler to do this, by setting up a FileDataSource in addition to my JDBC data source. I tried to use the FileDataSource to populate a sub-entity based on the db field that contains the full path/filename, but I wasn't sure how to reference that db field from the root query/entity. Before I spent too much time on it, I also realized I wasn't sure how to get Solr to deal with binary file types this way. Upon further reading, it seems I would need to use Tika for that. Can that be done within the confines of the DataImportHandler?

Any advice is greatly appreciated.

Thanks in advance,
Joe
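P.S. In case it helps frame the question, here is a rough Python sketch of the fallback I've been considering outside the DataImportHandler: loop over the db rows myself and send each file to Solr's extracting handler, passing the db columns as literal.<field> parameters. I'm assuming the handler is mapped at /update/extract and that the schema has matching fields; the Solr URL, column names, and sample row are all hypothetical, and I haven't run this against a real server.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

def build_extract_url(solr_base, row):
    """Build an extract-handler URL that carries the db columns as literal.* parameters."""
    # 'path' is only used to locate the file, so it is not sent as a field here.
    params = {"literal." + field: value for field, value in row.items() if field != "path"}
    return solr_base.rstrip("/") + "/update/extract?" + urlencode(params)

def index_file(solr_base, row):
    """POST the raw file bytes so that Solr (via Tika) extracts the text server-side."""
    with open(row["path"], "rb") as fh:
        body = fh.read()
    req = Request(
        build_extract_url(solr_base, row),
        data=body,
        headers={"Content-Type": "application/octet-stream"},
    )
    with urlopen(req) as resp:
        return resp.status

# Hypothetical row, as it might come back from the db query.
row = {"id": "doc-1", "customer_id": 42, "customer_name": "Acme Corp", "path": "/data/docs/report.pdf"}
url = build_extract_url("http://localhost:8983/solr", row)
print(url)
```

This would work around the sub-entity problem, but it means maintaining a separate indexing script, which is why I'd still prefer a DataImportHandler-only answer if one exists.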