[ 
https://issues.apache.org/jira/browse/SOLR-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060175#comment-14060175
 ] 

Uwe Schindler commented on SOLR-6016:
-------------------------------------

Schemaless mode is useful for one reason: to automatically generate a 
schema/mapping that you can modify later! For that purpose, though, it would 
be better to have a "schema learning" mode in Solr, where Solr does *not* 
index documents: you just pass a bunch of documents to Solr, and after you 
have done this (yes: AFTER) it creates the "best" schema for you and returns 
it. This gives the user the most flexibility: the system analyzes *many* 
documents and derives a schema, which can be used as the basis for your own 
schema.

The problem with the current schemaless mode is that it defines fields by 
their first occurrence. The problem is visible with double fields whose value 
has no decimal point: if the value in the first document looks like an 
integer, it creates an integer field. In my proposal, Solr would look at, 
say, 10 or 100 documents and then decide on a schema. Much more user-friendly.
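
To make the difference concrete, here is a minimal Python sketch of the two strategies. It is not Solr code and does not use any Solr API: the function names, the three type names ("long", "double", "string") and the sample documents are all assumptions made up for illustration. Only the value "479.95" is taken from the error in the issue.

```python
# Illustrative sketch only: none of these names exist in Solr.
# Hypothetical type names, ordered from narrowest to widest.
_ORDER = ["long", "double", "string"]


def guess_type(value):
    """Guess the narrowest type that can parse a single raw string value."""
    try:
        int(value)
        return "long"
    except ValueError:
        pass
    try:
        float(value)
        return "double"
    except ValueError:
        return "string"


def widen(a, b):
    """Return the wider of two guessed types."""
    return a if _ORDER.index(a) >= _ORDER.index(b) else b


def first_occurrence_schema(docs):
    """Fix each field's type from the first value seen (the behaviour criticised above)."""
    schema = {}
    for doc in docs:
        for field, value in doc.items():
            schema.setdefault(field, guess_type(value))
    return schema


def learn_schema(docs):
    """Look at all sample documents first, widening a field's type when needed."""
    schema = {}
    for doc in docs:
        for field, value in doc.items():
            guessed = guess_type(value)
            schema[field] = widen(schema[field], guessed) if field in schema else guessed
    return schema


# Made-up sample documents; the first price happens to look like an integer.
sample_docs = [
    {"id": "A1", "price": "399"},
    {"id": "B2", "price": "479.95"},
]
```

With these sample documents, `first_occurrence_schema(sample_docs)` settles on "long" for `price`, so the second document would no longer fit, while `learn_schema(sample_docs)` widens `price` to "double" because it has seen both values before deciding.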

A schema-learning mode would have been a better feature than the schemaless 
mode from Elasticsearch. By just copying that functionality we gained nothing 
(actually, it was a step back in my opinion), and we now have the same 
problems that all first-time users of Elasticsearch have.

> Failure indexing exampledocs with example-schemaless mode
> ---------------------------------------------------------
>
>                 Key: SOLR-6016
>                 URL: https://issues.apache.org/jira/browse/SOLR-6016
>             Project: Solr
>          Issue Type: Bug
>          Components: documentation, Schema and Analysis
>    Affects Versions: 4.7.2, 4.8
>            Reporter: Shalin Shekhar Mangar
>         Attachments: SOLR-6016.patch, solr.log
>
>
> Steps to reproduce:
> # cd example; java -Dsolr.solr.home=example-schemaless/solr -jar start.jar
> # cd exampledocs; java -jar post.jar *.xml
> Output from post.jar
> {code}
> Posting files to base url http://localhost:8983/solr/update using 
> content-type application/xml..
> POSTing file gb18030-example.xml
> POSTing file hd.xml
> POSTing file ipod_other.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8983/solr/update
> POSTing file ipod_video.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8983/solr/update
> POSTing file manufacturers.xml
> POSTing file mem.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8983/solr/update
> POSTing file money.xml
> POSTing file monitor2.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8983/solr/update
> POSTing file monitor.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8983/solr/update
> POSTing file mp500.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8983/solr/update
> POSTing file sd500.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8983/solr/update
> POSTing file solr.xml
> POSTing file utf8-example.xml
> POSTing file vidcard.xml
> SimplePostTool: WARNING: Solr returned an error #400 Bad Request
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8983/solr/update
> 14 files indexed.
> COMMITting Solr index changes to http://localhost:8983/solr/update..
> Time spent: 0:00:00.401
> {code}
> Exceptions in Solr (I am pasting just one of them):
> {code}
> 5105 [qtp697879466-14] ERROR org.apache.solr.core.SolrCore  – 
> org.apache.solr.common.SolrException: ERROR: [doc=EN7800GTX/2DHTV/256M] Error 
> adding field 'price'='479.95' msg=For input string: "479.95"
>       at 
> org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:167)
>       at 
> org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:77)
>       at 
> org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:234)
>       at 
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:160)
>       at 
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
>       at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
> ......
> Caused by: java.lang.NumberFormatException: For input string: "479.95"
>       at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>       at java.lang.Long.parseLong(Long.java:441)
>       at java.lang.Long.parseLong(Long.java:483)
>       at org.apache.solr.schema.TrieField.createField(TrieField.java:609)
>       at org.apache.solr.schema.TrieField.createFields(TrieField.java:660)
> {code}
> The full solr.log is attached.
> I understand why these errors occur but since we ship example data with Solr 
> to demonstrate our core features, I expect that indexing exampledocs should 
> work without errors.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
