Thanks all! I succeeded.

From: Andrew Grande [mailto:[email protected]]
Sent: March 23, 2018 21:13
To: [email protected]
Subject: Re: RE: put pictures from remote server into hdfs

I think the MOB expectation for HBase was around 10MB.

I agree it will take some thought to organize the space and region server 
splits with column families once the volume becomes significant.

Andrew
On Fri, Mar 23, 2018, 9:08 AM Mike Thomsen <[email protected]> wrote:
Off the top of my head, try PutHBaseCell for that. If you run into problems, 
let us know.

As a side note, you should be careful about storing large binary blobs in 
HBase. I don't know to what extent our processors support HBase MOBs either. In 
general, you'll probably be alright if the pictures are on the small side (< 
1MB), but be very careful beyond that.

If you have to store a lot of images and can't commit to a small file size, I 
would recommend looking at a blob store like S3 or OpenStack Swift. Maybe Ceph 
as well.
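
The size guidance in this thread can be sketched as a small routing rule. This is only an illustration in plain Python, not NiFi or HBase code: the 1 MB and 10 MB thresholds are the rules of thumb quoted in this thread, and the function name and the returned labels are hypothetical.

```python
# Thresholds are rules of thumb from this thread, not hard limits.
SMALL_CELL_LIMIT = 1 * 1024 * 1024   # ~1 MB: "probably alright" as a plain HBase cell
MOB_LIMIT = 10 * 1024 * 1024         # ~10 MB: rough MOB expectation quoted in the thread


def choose_store(size_bytes: int) -> str:
    """Return a hypothetical storage label for an image of the given size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes < SMALL_CELL_LIMIT:
        return "hbase-cell"    # small picture: an ordinary cell value
    if size_bytes <= MOB_LIMIT:
        return "hbase-mob"     # medium picture: only if MOB is available and supported
    return "blob-store"        # large picture: S3, Swift, Ceph or similar


if __name__ == "__main__":
    for size in (200 * 1024, 4 * 1024 * 1024, 50 * 1024 * 1024):
        print(size, choose_store(size))
```

In a real flow the same decision would more likely be made on the flow file's size before the put step; the labels above only name the three options discussed here.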

On Thu, Mar 22, 2018 at 8:59 PM, 李 磊 <[email protected]> wrote:
Hi Bryan:

Thanks for your response.

Using GetSFTP and PutHDFS is helpful.

Now I have another problem. Besides HDFS, the pictures from the remote server 
also need to be put into HBase, with the filename as the row key and the file 
content as a column.

This is why I stored the pictures locally and then used ExecuteFlumeSource 
with spooldir, which can read each picture as a whole, but I lose the 
filename.
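
The HBase layout described above (filename as row key, file content as a single column) can be sketched as follows. This is plain Python with no HBase client involved; the function name, the column family "f" and the qualifier "img" are made-up placeholders, not anything defined in this thread.

```python
def build_cell(filename: str, content: bytes,
               family: bytes = b"f", qualifier: bytes = b"img") -> dict:
    """Describe one HBase cell: row key = filename, value = picture bytes.

    The family and qualifier defaults are hypothetical placeholders.
    """
    if not filename:
        raise ValueError("filename must not be empty, it is the row key")
    return {
        "row": filename.encode("utf-8"),   # row key taken from the file name
        "family": family,
        "qualifier": qualifier,
        "value": content,                  # the whole picture as one cell value
    }


if __name__ == "__main__":
    cell = build_cell("cat.jpg", b"\xff\xd8\xff\xe0 fake jpeg bytes")
    print(cell["row"], len(cell["value"]))
```

The point of the sketch is that the filename has to travel with the content all the way to the put step, which is what gets lost when the file goes through the spooling directory.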

-----Original Message-----
From: Bryan Bende [mailto:[email protected]]
Sent: March 23, 2018 0:42
To: [email protected]
Subject: Re: put pictures from remote server into hdfs

Hello,

It would probably be best to use GetSFTP -> PutHDFS.

There is no need to write the files out to local disk somewhere else with 
PutFile; they can go straight to HDFS.

The filename in HDFS will be the "filename" attribute of the flow file, which 
GetSFTP should be setting to the filename it picked up.

If you need a different filename, you can stick an UpdateAttribute before 
PutHDFS and change the filename attribute to whatever makes sense.
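
As a rough model of the naming behaviour described above: the target path is the configured directory plus the flow file's "filename" attribute, and changing that attribute beforehand changes the name in HDFS. The sketch is plain Python, not the NiFi API; the directory, the sample filename and the function names are hypothetical.

```python
import posixpath


def hdfs_target_path(directory: str, attributes: dict) -> str:
    """Join the target directory with the flow file's "filename" attribute."""
    return posixpath.join(directory, attributes["filename"])


def with_filename(attributes: dict, new_name: str) -> dict:
    """Return a copy of the attributes with "filename" replaced,
    similar in spirit to an UpdateAttribute step before PutHDFS."""
    updated = dict(attributes)
    updated["filename"] = new_name
    return updated


if __name__ == "__main__":
    attrs = {"filename": "cat.jpg"}   # what GetSFTP is expected to set
    print(hdfs_target_path("/data/pictures", attrs))

    # Hypothetical rename: prefix the original name with a date.
    renamed = with_filename(attrs, "2018-03-22_" + attrs["filename"])
    print(hdfs_target_path("/data/pictures", renamed))
```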

-Bryan


On Thu, Mar 22, 2018 at 12:18 PM, 李 磊 <[email protected]> wrote:
> Hi all,
>
> I need to put pictures from a remote server (not in the NiFi cluster)
> into HDFS.
>
> First I use GetSFTP and PutFile to fetch the pictures to local disk, and
> then use ExecuteFlumeSource and ExecuteFlumeSink to put the pictures into
> HDFS from there.
>
> However, the pictures written to HDFS do not keep the same names they
> have locally.
>
> Could you tell me how to keep the names the same, or a better way to put
> pictures from a remote server into HDFS with NiFi?
>
> Thanks!
