I should also mention that I'm trying to use the RabbitMQ river, which is 
why I'm converting files into Base64 to begin with.
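
For reference, here is a minimal sketch of the encoding step. It rests on an assumption I have not confirmed: that the attachment field wants Base64 on a single line, which is how I read the MIME-NO-LINEFEEDS part of the error quoted below. Java's Base64.getMimeEncoder() wraps its output at 76 characters with CRLF, while Base64.getEncoder() emits no line separators. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64SingleLine {

    // Hypothetical helper: encode bytes as Base64 on a single line.
    // Base64.getEncoder() never inserts CR/LF, unlike Base64.getMimeEncoder().
    static String encodeSingleLine(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        // 100 bytes encode to 136 Base64 characters, enough to exceed one 76-character MIME line.
        byte[] data = new byte[100];

        String mime = Base64.getMimeEncoder().encodeToString(data);
        String basic = encodeSingleLine(data);

        // Report whether each variant contains a CRLF line break.
        System.out.println("MIME encoder has line breaks:  " + mime.contains("\r\n"));
        System.out.println("Basic encoder has line breaks: " + basic.contains("\r\n"));

        // Small sanity check on a known value.
        System.out.println(encodeSingleLine("hi".getBytes(StandardCharsets.UTF_8)));
    }
}
```

For a real file, the input would be Files.readAllBytes(file.toPath()) as in the snippets quoted below. Whether this resolves the parse error is untested.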

Thanks again!

On Wednesday, November 19, 2014 2:22:23 PM UTC-5, 
[email protected] wrote:
>
> Bump on this. I've tried using the built-in Java Base64 encoding:
> Base64.getMimeEncoder().encode(Files.readAllBytes(file.toPath()));
>
>
> And using Jackson's ObjectMapper as follows:
> String base64 = mapper.writeValueAsString(Files.readAllBytes(file.toPath()));
> base64 = base64.substring(1, base64.length() - 1);
>
>
> Can anyone help out and point me to where I'm going wrong?
>
> On Sunday, May 12, 2013 3:59:54 PM UTC-4, Massimiliano Perantoni wrote:
>>
>> Hi,
>> I installed elasticsearch without problems and started developing a mail 
>> indexing solution.
>> The main setup went smoothly, and I also installed the plugin for Tika 
>> document text extraction.
>> After that I wrote some simple beans that parse emails with JavaMail and 
>> write them into the system.
>> When it comes to indexing attachments (doc, pdf, docx, OpenDocument, 
>> etc.), several mails were indexed correctly, but others were not.
>> I had some problems putting the Base64-encoded documents from the email 
>> in directly, so I chose to decode the contents and re-encode them, just 
>> to be sure everything was written correctly.
>> When I create the JSON file (attached to this email), I can also recreate 
>> the decoded document, which is readable, so the payload I pass to 
>> elasticsearch should be valid.
>> Here are the versions:
>>
>> elasticsearch version 0.90.0
>> elasticsearch-mapper-attachments 1.7.0
>>
>> See attached json as test document
>> Here's the mapping I used
>> curl -XGET 'http://localhost:9200/anagrafiche/email/_mapping?pretty=true'
>> {
>>   "email" : {
>>     "properties" : {
>>       "addTimestamp" : {
>>         "type" : "string"
>>       },
>>       "answered" : {
>>         "type" : "boolean"
>>       },
>>       "attacheddocument" : {
>>         "type" : "attachment",
>>         "path" : "full",
>>         "fields" : {
>>           "attacheddocument" : {
>>             "type" : "string"
>>           },
>>           "author" : {
>>             "type" : "string"
>>           },
>>           "title" : {
>>             "type" : "string"
>>           },
>>           "name" : {
>>             "type" : "string"
>>           },
>>           "date" : {
>>             "type" : "date",
>>             "format" : "dateOptionalTime"
>>           },
>>           "keywords" : {
>>             "type" : "string"
>>           },
>>           "content_type" : {
>>             "type" : "string"
>>           }
>>         }
>>       },
>>       "cgateId" : {
>>         "type" : "string"
>>       },
>>       "contents" : {
>>         "type" : "string"
>>       },
>>       "date" : {
>>         "type" : "date",
>>         "format" : "dateOptionalTime",
>>         "include_in_all" : true
>>       },
>>       "filePath" : {
>>         "type" : "string"
>>       },
>>       "from" : {
>>         "properties" : {
>>           "address" : {
>>             "type" : "string"
>>           },
>>           "encodedPersonal" : {
>>             "type" : "string",
>>             "include_in_all" : true
>>           }
>>         }
>>       },
>>       "hasattachments" : {
>>         "type" : "boolean"
>>       },
>>       "numlines" : {
>>         "type" : "long"
>>       },
>>       "recipient" : {
>>         "properties" : {
>>           "address" : {
>>             "type" : "string"
>>           },
>>           "encodedPersonal" : {
>>             "type" : "string",
>>             "include_in_all" : true
>>           }
>>         }
>>       },
>>       "seen" : {
>>         "type" : "boolean"
>>       },
>>       "subject" : {
>>         "type" : "string"
>>       }
>>     }
>>   }
>> }
>>
>> Here's the output of the indexing command attempt
>> [maxper@max ~]$ curl -XPOST 'http://localhost:9200/anagrafiche/email/' 
>> -d  @testindex.json 
>> {"error":"MapperParsingException[failed to parse]; nested: 
>> JsonParseException[Failed to decode VALUE_STRING as base64 
>> (MIME-NO-LINEFEEDS): Unexpected padding character ('=') as character #3 of 
>> 4-char base64 unit: padding only legal as 3rd or 4th character\n at 
>> [Source: [B@45387c9d; line: 1, column: 32804]]; ","status":400}
>> [maxper@max ~]$ 
>>
>> Just to be clear, I really can index some documents, so the mapping 
>> should be correct.
>>
>> I hope someone may help me :)
>> Thanks, Massimiliano
>>
>
