[ 
https://issues.apache.org/jira/browse/SOLR-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove updated SOLR-9525:
------------------------------
    Attachment: SOLR-9525.patch

Full implementation and tests for a split operation. Because it's implemented 
as an operation, this will work as part of a select(...) stream.

Valid expression forms:

{code}
split(fieldA, on=",") // replace value of fieldA with List<String> of split values
split(fieldA, on=",", as="fieldB") // splits value of fieldA into List<String> and puts into fieldB
{code}
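
To make the two forms concrete, here is a minimal standalone sketch of the semantics described above. It is not the code from SOLR-9525.patch: the class name SplitSketch, the use of a plain Map as a stand-in for a tuple, and the handling of missing values are all assumptions for illustration. It also assumes the "on" value is a literal delimiter rather than a regular expression, which the comment above does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the class from SOLR-9525.patch.
final class SplitSketch {

    // Splits the String value of `field` on the literal delimiter `on`.
    // If `as` is null, the resulting list replaces the value of `field`;
    // otherwise the list is stored under `as` and `field` is left unchanged.
    static void split(Map<String, Object> tuple, String field, String on, String as) {
        Object value = tuple.get(field);
        if (!(value instanceof String)) {
            return; // assumption: missing or non-String values are left untouched
        }
        // Pattern.quote makes the delimiter literal; limit -1 keeps trailing empty strings.
        String[] parts = ((String) value).split(Pattern.quote(on), -1);
        List<String> values = new ArrayList<>(Arrays.asList(parts));
        tuple.put(as == null ? field : as, values);
    }

    public static void main(String[] args) {
        Map<String, Object> tuple = new HashMap<>();
        tuple.put("fieldA", "red,green,blue");

        split(tuple, "fieldA", ",", "fieldB");   // like split(fieldA, on=",", as="fieldB")
        System.out.println(tuple.get("fieldB")); // prints [red, green, blue]

        split(tuple, "fieldA", ",", null);       // like split(fieldA, on=",")
        System.out.println(tuple.get("fieldA")); // prints [red, green, blue]
    }
}
```

In the second call the original String in fieldA is overwritten by the list, which matches the "replace value of fieldA" wording of the first expression form.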

> split() function for streaming
> ------------------------------
>
>                 Key: SOLR-9525
>                 URL: https://issues.apache.org/jira/browse/SOLR-9525
>             Project: Solr
>          Issue Type: Wish
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Mike Thomsen
>         Attachments: SOLR-9525.patch
>
>
> This is the original description I posted on solr-user:
> Read this article and thought it could be interesting as a way to do 
> ingestion:
> https://dzone.com/articles/solr-streaming-expressions-for-collection-auto-upd-1
> Example from the article:
> daemon(id="12345",
>  runInterval="60000",
>  update(users,
>   batchSize=10,
>   jdbc(connection="jdbc:mysql://localhost/users?user=root&password=solr",
>    sql="SELECT id, name FROM users", sort="id asc",
>    driver="com.mysql.jdbc.Driver")
>  )
> )
> What's the best way to handle a multivalue field using this API? Is there a 
> way to tokenize something returned in a database field?
> Joel Bernstein responded with this:
> Unfortunately there currently isn't a way to split a field. But this would
> be nice functionality to add.
> The approach would be to add a split operation that would be used by the
> select() function. It would look like this:
> select(jdbc(...), split(fieldA, delim=","), ...)
> This would make a good jira issue.
> So the TL;DR version is that I need the ability to specify, in such a 
> streaming operation, certain fields to tokenize into multivalue fields. In one 
> schema I may have to support, there are probably half a dozen such fields.
> Perhaps I am missing a feature here, but it looks like this new capability 
> cannot handle multivalue fields until something like this is in place.


