Thanks. It sounds like you are treating it as an attachment. In your 
example, what is "fileContents" in .field("content", fileContents)? 
How do I get the file contents of an image? I know that in the case of a 
PDF, this is the text content of the PDF.
Correct, I don't want to index the image binary; I just need to be able to 
pull up the image when its text field has a match.

On Thursday, February 27, 2014 8:29:25 AM UTC-5, Binh Ly wrote:
>
> You certainly can add a new field, and then just put the OCR text into 
> that new field. So for example:
>
> Mapping:
>
>         PutMappingResponse putMappingResponse = new PutMappingRequestBuilder(
>             client.admin().indices()).setIndices(INDEX_NAME).setType(DOCUMENT_TYPE).setSource(
>                 XContentFactory.jsonBuilder().startObject()
>                     .field(DOCUMENT_TYPE).startObject()
>                         .field("properties").startObject()
>                             .field("text").startObject()
>                                 .field("type", "string")
>                             .endObject()
>                             .field("file").startObject()
>                                 .field("store", "yes")
>                                 .field("type", "attachment")
>                                 .field("fields").startObject()
>                                     .field("file").startObject()
>                                         .field("store", "yes")
>                                     .endObject()
>                                 .endObject()
>                             .endObject()
>                         .endObject()
>                     .endObject()
>                 .endObject()
>         ).execute().actionGet();
>
> Then put the OCR text into the "text" field:
>
>         IndexResponse indexResponse = client.prepareIndex(INDEX_NAME, DOCUMENT_TYPE, "1")
>             .setSource(XContentFactory.jsonBuilder().startObject()
>                 .field("text", ocrText)
>                 .field("file").startObject()
>                     .field("content", fileContents)
>                     .field("_indexed_chars", -1)
>                 .endObject()
>             .endObject()
>         ).execute().actionGet();
>
> You probably don't need to index the image binary information - not sure 
> what you would need it for.
>
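
For reference, here is a minimal sketch of one way "fileContents" could be 
produced for an image. It assumes the attachment type expects the raw file 
bytes encoded as base64 (as far as I understand, that applies to PDFs as well 
as images), and that java.util.Base64 (Java 8) is available. The class and 
method names (ImageContents, encodeFile) are hypothetical and not part of 
Elasticsearch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper, not part of the Elasticsearch API.
final class ImageContents {

    private ImageContents() {
    }

    // Reads the whole file into memory and returns its bytes as a base64 string.
    // Assumption: the "content" sub-field of an attachment wants this encoding.
    static String encodeFile(Path file) throws IOException {
        byte[] raw = Files.readAllBytes(file);
        return Base64.getEncoder().encodeToString(raw);
    }
}
```

The returned string could then be passed as fileContents in 
.field("content", fileContents). If the image binary is not needed in the 
index at all, the "file" object can presumably be left out and only the 
"text" field indexed.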
