Re: NiFi PutElasticsearch Processor

2017-02-20 Thread Matt Burgess
Shanka,

The Fetch/PutElasticsearch processors are built to be part of the ES cluster, 
and IIRC Elasticsearch only promises compatibility across dot releases of a 
particular major/minor version. I think ours are built against 2.1.x, so they 
might work with ES 2.2.0, but that is not guaranteed. Likewise, the ES 5 
processors are built with ES 5.0.1, so they should work with 5.0.x and most 
likely won't work with an ES 2.x cluster.

There is a set of HTTP processors (Fetch/PutElasticsearchHttp for example) that 
are more robust in terms of which versions of ES clusters they support, as 
these processors use the more stable REST API versus the more volatile (but 
more performant) native transport API.
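
As a rough illustration of why the REST route is less tied to a client 
version: for the 2.x and 5.x releases discussed here, indexing a document over 
HTTP is a PUT of a JSON body to /index/type/id. The Python sketch below only 
builds such a request and does not send it. It is not how the NiFi processors 
are implemented, and the host, index, type, and id are made-up values for the 
example.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_type, doc_id, document):
    """Build (but do not send) an HTTP request that indexes one JSON document."""
    url = f"{base_url}/{index}/{doc_type}/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host, index, type, and id, chosen only for this example.
request = build_index_request(
    "http://localhost:9200", "documents", "pdf", "1",
    {"content": "test elasticsearch"},
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# Sending it to a running cluster would be: urllib.request.urlopen(request)
```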

Regards,
Matt




Re: NiFi PutElasticsearch Processor

2017-02-20 Thread shankhamajumdar
Hi Mark,

I have resolved the JSON attribute issue by increasing the value of Maximum
Capture Group Length in the ExtractText processor.

I have one more question. With the PutElasticsearch processor I am using
Elasticsearch 2.2.0. Is it possible to use Elasticsearch 5 with the
PutElasticsearch processor?

Regards,
Shankha



--
View this message in context: 
http://apache-nifi-developer-list.39713.n7.nabble.com/NiFi-PutElasticsearch-Processor-tp14733p14822.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: NiFi PutElasticsearch Processor

2017-02-20 Thread shankhamajumdar
Thanks Mark, your solution worked! I am facing one more issue. I am
trying to put the entire content in a single JSON attribute using the
AttributesToJSON processor. It works, but the attribute does not capture
the entire content, only roughly the first 7 lines. Is there a limit on
this, and how can I resolve the issue?

Regards,
Shankha




--
View this message in context: 
http://apache-nifi-developer-list.39713.n7.nabble.com/NiFi-PutElasticsearch-Processor-tp14733p14820.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: NiFi PutElasticsearch Processor

2017-02-17 Thread Mark Payne
Shankha,

With Java regexes, by default, the dot character does not match newlines, so 
(.+) will only match a single line. On ExtractText, you can change the property 
named "Enable DOTALL Mode" to true, which should allow the .+ to capture all of 
the text in the FlowFile.
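
The effect of DOTALL can be seen outside NiFi with a few lines of Python. 
This is only a sketch: it uses Python's re module rather than Java's 
java.util.regex, on the assumption that the two behave the same way for this 
simple case (dot stops at \n unless DOTALL is set). The sample text is made up.

```python
import re

text = "line one\nline two\nline three"

# Default: '.' stops at the first newline, so only one line is captured.
default_match = re.search(r"(.+)", text)
print(repr(default_match.group(1)))

# DOTALL: '.' also matches newlines, so the whole text is captured.
dotall_match = re.search(r"(.+)", text, flags=re.DOTALL)
print(repr(dotall_match.group(1)))

# The inline flag (?s) switches on the same mode from inside the pattern.
inline_match = re.search(r"(?s)(.+)", text)
print(repr(inline_match.group(1)))
```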

Thanks
-Mark






Re: NiFi PutElasticsearch Processor

2017-02-17 Thread shankhamajumdar
Hi,

I have added an ExtractText processor and given it a new property called
myAttribute with the value (.+). Then I added an AttributesToJSON processor
with Attributes List set to myAttribute. As a result I am getting the JSON
structure below.

{"myAttribute":"test elasticsearch"}

But it's not working for multiline content, as the JSON attribute takes a
single line only. To resolve this I want to keep the entire content on a
single line, so I have added a ReplaceText processor before the
AttributesToJSON processor. In the ReplaceText processor I am trying to
replace \n with a space so that the entire content ends up on a single line.

Can you please tell me how to put the entire content on a single line using
ReplaceText? I have used \n as the search value and ' ' as the replacement
value, but this is not working properly.

Please provide some input on this.

Regards,
Shankha



--
View this message in context: 
http://apache-nifi-developer-list.39713.n7.nabble.com/NiFi-PutElasticsearch-Processor-tp14733p14774.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: NiFi PutElasticsearch Processor

2017-02-14 Thread Matt Burgess
I agree with Joe, it sounds like the flow file content is the
plain-text extract of the PDF, and the ES processors are expecting
JSON documents. You can use ReplaceText to wrap your content in a JSON
object, with something like:

{ "content" : "$1" }

(I didn't try that but the idea is just to put the original text into
a field inside a JSON object)
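
The point of the wrapping is that the text has to end up as a JSON string
value, so any quotes and newlines in it need to be escaped as well. A small
Python sketch of that idea, run outside NiFi (wrap_as_json, the field name,
and the sample text are only illustrative):

```python
import json


def wrap_as_json(text, field="content"):
    """Wrap raw text in a one-field JSON object, escaping it as needed."""
    return json.dumps({field: text})


# Sample text with a newline and double quotes, both of which need escaping.
extracted = 'First line of the PDF\nSecond line with "quotes"'
document = wrap_as_json(extracted)
print(document)

# Parsing the result gives back the original text unchanged.
print(json.loads(document)["content"] == extracted)
```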

Alternatively as Joe mentioned, you may want to consider offering the
user a choice of output format (Text, JSON, etc.) in your custom
processor.

Regards,
Matt

On Tue, Feb 14, 2017 at 8:27 AM, Joe Witt <joe.w...@gmail.com> wrote:
> Hello
>
> Is the code available to provide feedback on?  When you extract aspects
> from the PDF, where are you putting the extracted values?  You probably want
> to convert the PDF content into the extracted results in JSON.  You could also
> extract aspects of the PDF into flowfile attributes and then make JSON in
> another processor, but this won't scale as well for large documents/extracts.
>
> Thanks
> Joe


NiFi PutElasticsearch Processor

2017-02-14 Thread shankhamajumdar
Hi,

I am working on a use case where I need to load a PDF document into
Elasticsearch. I have written a custom NiFi processor using Apache Tika,
which extracts the content of the PDF. The NiFi flow is described below.

1. The NiFi GetFile processor gets the PDF file from the source directory.

2. The custom NiFi processor, written using Apache Tika, extracts the
content of the PDF file.

3. The NiFi PutElasticsearch processor loads the data into Elasticsearch,
but I am getting the error below.

MapperParsingException[failed to parse]; nested:
NotXContentException[Compressor detection can only
be called on some xcontent bytes or compressed xcontent bytes];

Regards,
Shankha



--
View this message in context: 
http://apache-nifi-developer-list.39713.n7.nabble.com/NiFi-PutElasticsearch-Processor-tp14733.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.