Re: HDF NiFi - Does NiFi write provenance/data on HDP nodes?

2017-06-15 Thread Bryan Bende
Hi Shashi,

This list is more about Apache NiFi and is not really specific to any
vendor distributions.

That being said, whatever node NiFi is running on, it will be using
local disk to store the internal repositories (flow file, content,
provenance).
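
To see which local directories those are on a given node, a minimal
sketch along these lines can read them out of conf/nifi.properties. The
property names are the standard ones; the sample values are the shipped
defaults and are only an assumption about any particular installation,
and repository_dirs is a hypothetical helper name.

```python
# Sketch: list the local directories NiFi uses for its internal repositories.
# Assumption: the sample text mirrors default values; a real install may differ.

# The content and provenance repositories may have several entries, so those
# two are matched by prefix ("...directory.default", "...directory.disk2", ...).
REPO_KEYS = (
    "nifi.flowfile.repository.directory",
    "nifi.content.repository.directory.",
    "nifi.provenance.repository.directory.",
)


def repository_dirs(properties_text):
    """Return {property name: directory} for the repository settings."""
    dirs = {}
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key, value = key.strip(), value.strip()
        if key.startswith(REPO_KEYS):
            dirs[key] = value
    return dirs


# Illustrative excerpt of conf/nifi.properties (not taken from a real cluster).
SAMPLE = """\
nifi.flowfile.repository.directory=./flowfile_repository
nifi.content.repository.directory.default=./content_repository
nifi.provenance.repository.directory.default=./provenance_repository
nifi.web.http.port=8080
"""

if __name__ == "__main__":
    for key, value in repository_dirs(SAMPLE).items():
        print(f"{key} -> {value}")
```

Relative paths such as ./content_repository resolve against the NiFi
installation directory on the machine where NiFi itself runs.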

When communicating with HDFS through the PutHDFS processor, NiFi
reads data from its content repository and sends it to the data
nodes of the HDFS cluster, the same as if you had installed the HDFS
command line client and written a file to HDFS.
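
For comparison, the command line equivalent would look roughly like the
sketch below. It only builds and prints the command rather than running
it; the two paths and the helper name hdfs_put_command are hypothetical.

```python
import shlex


def hdfs_put_command(local_path, hdfs_path):
    """Build the argv for `hdfs dfs -put <local> <dest>` (not executed here)."""
    return ["hdfs", "dfs", "-put", local_path, hdfs_path]


# Hypothetical paths, for illustration only.
cmd = hdfs_put_command("/tmp/report.csv", "/data/incoming/report.csv")
print(shlex.join(cmd))
```

In both cases the writer is an HDFS client: the file content ends up on
the data nodes only as HDFS blocks.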

-Bryan




Re: HDF NiFi - Does NiFi write provenance/data on HDP nodes?

2017-06-15 Thread Shashi Vishwakarma
Hi Koji

I am trying to evaluate HDF NiFi from a security perspective. I want to
make sure that when HDF NiFi talks to HDP, it does not leak or spill any
kind of information onto the HDP data nodes (i.e. onto their local disks).
I am fine if it writes to HDFS.






Re: HDF NiFi - Does NiFi write provenance/data on HDP nodes?

2017-06-14 Thread Koji Kawamura
Hi Shashi,

Sorry for the delayed response. I am not aware that NiFi writes any
provenance information on HDP nodes. But if your goal is to expose
NiFi provenance data to HDFS, Hive (or Spark) to analyze provenance
data using those services, then SiteToSiteProvenanceReportingTask
might be helpful.

SiteToSiteProvenanceReportingTask can send provenance events in JSON
format. You can send them to a NiFi input port and then pass them into
HDFS with the PutHDFS processor.
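
Once those JSON events are available, a minimal sketch like the following
could summarize them, for example to see which events sent data out of
NiFi and to where. The sample batch is hypothetical, and the field names
used (eventType, componentType, transitUri) are assumptions that should
be checked against the output of your NiFi version.

```python
import json
from collections import Counter

# Hypothetical batch shaped like a JSON array of provenance events.
# The values are illustrative and were not captured from a real cluster.
SAMPLE_BATCH = """
[
  {"eventId": "e1", "eventType": "RECEIVE", "componentType": "GetFile",
   "transitUri": null},
  {"eventId": "e2", "eventType": "SEND", "componentType": "PutHDFS",
   "transitUri": "hdfs://namenode.example:8020/data/a.txt"},
  {"eventId": "e3", "eventType": "DROP", "componentType": "PutHDFS",
   "transitUri": null}
]
"""


def summarize(batch_json):
    """Count events by type and collect the destinations of SEND events."""
    events = json.loads(batch_json)
    counts = Counter(e.get("eventType", "UNKNOWN") for e in events)
    sent_to = [
        e["transitUri"]
        for e in events
        if e.get("eventType") == "SEND" and e.get("transitUri")
    ]
    return counts, sent_to


if __name__ == "__main__":
    counts, sent_to = summarize(SAMPLE_BATCH)
    print(dict(counts))
    print(sent_to)
```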

If not, would you elaborate what you are trying to accomplish?

Thanks,
Koji



HDF NiFi - Does NiFi write provenance/data on HDP nodes?

2017-06-11 Thread Shashi Vishwakarma
Hi

I have an HDF cluster with three NiFi instances that launch jobs
(Hive/Spark) on an HDP cluster. NiFi usually writes all of its information
to the repositories available on the local machine.

My question is: does NiFi write any data or provenance information, or
spill anything, onto HDP nodes (e.g. data nodes in the HDP cluster) while
accessing HDFS, Hive, or Spark services?

Thanks

Shashi