[ 
https://issues.apache.org/jira/browse/SOLR-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060934#comment-14060934
 ] 

Hoss Man commented on SOLR-6016:
--------------------------------

bq. We could add a SchemaGeneratorHandler which would generate the "best" 
schema.

You wouldn't need/want a handler for this -- you'd just need an 
UpdateProcessorFactory to use in place of RunUpdateProcessorFactory that would 
look at the datatypes of the fields in each document w/o doing any indexing and 
pick the least common denominator.

So then you'd have a chain with all of your normal update processors including 
the TypeMapping processors configured with the precedence orders and locales 
and format strings you want -- and at the end you'd have your 
BestFitSchemaGeneratorUpdateProcessorFactory that would look at all those docs, 
study their values, and throw them away -- until a {{commit}} comes along, at 
which point it does all the under the hood schema field addition calls.
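
A rough sketch of the "study the values, throw the docs away, act on commit" idea, in plain Java with no Solr classes. Everything here is hypothetical and only for illustration: the class name {{BestFitTypeLearner}}, the three-level {{LONG}} / {{DOUBLE}} / {{STRING}} ordering, and the {{observe}} / {{commit}} methods are assumptions, not existing Solr APIs. A real processor would rely on the configured TypeMapping processors for parsing and would make schema field addition calls instead of returning a map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: tracks the most general type seen so far per field. */
class BestFitTypeLearner {

    /** Assumed ordering, from most specific to most general. */
    enum FieldType { LONG, DOUBLE, STRING }

    private final Map<String, FieldType> learned = new LinkedHashMap<>();

    /** Guess the narrowest type that can hold one non-null raw value. */
    static FieldType guess(String raw) {
        try {
            Long.parseLong(raw);
            return FieldType.LONG;
        } catch (NumberFormatException notALong) {
            // Not an integer; try a floating point parse next.
        }
        try {
            Double.parseDouble(raw);
            return FieldType.DOUBLE;
        } catch (NumberFormatException notADouble) {
            return FieldType.STRING;
        }
    }

    /** Pick the more general of two types (the "least common denominator"). */
    static FieldType widen(FieldType a, FieldType b) {
        return a.ordinal() >= b.ordinal() ? a : b;
    }

    /** Study one field value and discard it; nothing is indexed. */
    void observe(String field, String raw) {
        learned.merge(field, guess(raw), BestFitTypeLearner::widen);
    }

    /** On commit, return the learned field types and reset the learner. */
    Map<String, FieldType> commit() {
        Map<String, FieldType> result = new LinkedHashMap<>(learned);
        learned.clear();
        return result;
    }

    public static void main(String[] args) {
        BestFitTypeLearner learner = new BestFitTypeLearner();
        learner.observe("price", "479");     // looks like a long at first
        learner.observe("price", "479.95");  // later value forces a wider type
        System.out.println(learner.commit());
    }
}
```

Because the type is only fixed at commit time, a field whose first value looks like a long can still end up wider once a later document carries a value like {{479.95}}, which is the case that fails when the type is guessed from the first document alone.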

So to learn, you'd send docs using whatever handler/format you want (json, xml, 
extraction, etc...) with an 
{{update.chain=my.datatype.learning.processor.chain}} request param ... and 
once you've sent a bunch and given it a lot of variety to see, then you send a 
commit so it creates the schema and then you re-index your docs for real w/o 
that special chain.

Varun: want to open a new issue for this idea? ... it's related to but independent 
of the current issue, which might have other tweaks/improvements of its own.

> Failure indexing exampledocs with example-schemaless mode
> ---------------------------------------------------------
>
>                 Key: SOLR-6016
>                 URL: https://issues.apache.org/jira/browse/SOLR-6016
>             Project: Solr
>          Issue Type: Bug
>          Components: documentation, Schema and Analysis
>    Affects Versions: 4.7.2, 4.8
>            Reporter: Shalin Shekhar Mangar
>         Attachments: SOLR-6016.patch, solr.log
>
>
> Steps to reproduce:
> # cd example; java -Dsolr.solr.home=example-schemaless/solr -jar start.jar
> # cd exampledocs; java -jar post.jar *.xml
> Output from post.jar
> {code}
> Posting files to base url http://localhost:8983/solr/update using 
> content-type application/xml..
> POSTing file gb18030-example.xml
> POSTing file hd.xml
> POSTing file ipod_other.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8983/solr/update
> POSTing file ipod_video.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8983/solr/update
> POSTing file manufacturers.xml
> POSTing file mem.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8983/solr/update
> POSTing file money.xml
> POSTing file monitor2.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8983/solr/update
> POSTing file monitor.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8983/solr/update
> POSTing file mp500.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8983/solr/update
> POSTing file sd500.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8983/solr/update
> POSTing file solr.xml
> POSTing file utf8-example.xml
> POSTing file vidcard.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8983/solr/update
> 14 files indexed.
> COMMITting Solr index changes to http://localhost:8983/solr/update..
> Time spent: 0:00:00.401
> {code}
> Exceptions in Solr (I am pasting just one of them):
> {code}
> 5105 [qtp697879466-14] ERROR org.apache.solr.core.SolrCore  – 
> org.apache.solr.common.SolrException: ERROR: [doc=EN7800GTX/2DHTV/256M] Error 
> adding field 'price'='479.95' msg=For input string: "479.95"
>       at 
> org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:167)
>       at 
> org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:77)
>       at 
> org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:234)
>       at 
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:160)
>       at 
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
>       at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
> ......
> Caused by: java.lang.NumberFormatException: For input string: "479.95"
>       at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>       at java.lang.Long.parseLong(Long.java:441)
>       at java.lang.Long.parseLong(Long.java:483)
>       at org.apache.solr.schema.TrieField.createField(TrieField.java:609)
>       at org.apache.solr.schema.TrieField.createFields(TrieField.java:660)
> {code}
> The full solr.log is attached.
> I understand why these errors occur but since we ship example data with Solr 
> to demonstrate our core features, I expect that indexing exampledocs should 
> work without errors.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
