Mike Thomsen created SOLR-9525:

             Summary: split() function for streaming
                 Key: SOLR-9525
                 URL: https://issues.apache.org/jira/browse/SOLR-9525
             Project: Solr
          Issue Type: Wish
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Mike Thomsen

This is the original description I posted on solr-user:

Read this article and thought it could be interesting as a way to do ingestion:


Example from the article:





sql="SELECT id, name FROM users", sort="id asc", driver="com.mysql.jdbc.Driver")


What's the best way to handle a multivalue field using this API? Is there a way 
to tokenize something returned in a database field?

Joel Bernstein responded with this:

Unfortunately there currently isn't a way to split a field. But this would
be nice functionality to add.

The approach would be to add a split operation that would be used by the
select() function. It would look like this:

select(jdbc(...), split(fieldA, delim=","), ...)

This would make a good jira issue.

So the TL;DR version is that I need the ability to specify, in such a streaming 
operation, which fields to tokenize into multivalue fields. In one schema I 
may have to support, there are probably half a dozen such fields.
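
To make the requested behavior concrete, here is a minimal sketch in Python of
what a split operation applied to several fields might do to each tuple in a
stream. It is not Solr code and not an existing Solr API: the function name
split_fields, the sample field names (tags, roles) and the choice to trim
whitespace and drop empty tokens are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Union

# A tuple in the stream: field name -> single value, list of values, or None.
Value = Union[str, List[str], None]
Record = Dict[str, Value]


def split_fields(tuples: Iterable[Record], delims: Dict[str, str]) -> Iterator[Record]:
    """Yield a copy of each tuple with the named string fields split into lists.

    `delims` maps a field name to its delimiter (hypothetical parameter,
    loosely modeled on split(fieldA, delim=",") from the discussion above).
    """
    for tup in tuples:
        out = dict(tup)  # shallow copy so the input tuple is left untouched
        for field, delim in delims.items():
            value: Optional[Value] = out.get(field)
            if isinstance(value, str):
                # Assumption: trim whitespace and discard empty tokens.
                out[field] = [p.strip() for p in value.split(delim) if p.strip()]
        yield out


# Sample rows standing in for what a jdbc(...) source might return.
rows = [
    {"id": "1", "name": "alice", "tags": "red,green, blue", "roles": "admin|dev"},
    {"id": "2", "name": "bob", "tags": None, "roles": "dev"},
]

result = list(split_fields(rows, {"tags": ",", "roles": "|"}))
```

Fields that are missing or null pass through unchanged in this sketch, so a
schema with half a dozen delimited columns would only need one entry per
column in the delimiter map.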

Perhaps I am missing a feature here, but until this is done it looks like this 
new capability cannot handle multivalue fields.

This message was sent by Atlassian JIRA
