I think HBase MOB was designed with objects of up to around 10MB in mind. I agree it will take some thought to organize the space and the region server splits with column families once this volume becomes significant.
Andrew

On Fri, Mar 23, 2018, 9:08 AM Mike Thomsen <[email protected]> wrote:

> Off the top of my head, try PutHBaseCell for that. If you run into
> problems, let us know.
>
> As a side note, you should be careful about storing large binary blobs in
> HBase. I don't know to what extent our processors support HBase MOBs
> either. In general, you'll probably be alright if the pictures are on the
> small side (< 1MB), but be very careful beyond that.
>
> If you have to store a lot of images and aren't able to commit to a small
> file size, I would recommend looking at a BLOB store like S3 or OpenStack
> Swift. Maybe Ceph as well.
>
> On Thu, Mar 22, 2018 at 8:59 PM, 李 磊 <[email protected]> wrote:
>
>> Hi Bryan:
>>
>> Thanks for your response.
>>
>> Using GetSFTP and PutHDFS is helpful.
>>
>> Now I have another problem. Besides HDFS, the pictures from the remote
>> server also need to be put into HBase, with the filename as the rowkey
>> and the file as a column.
>>
>> This is the reason why I store the pictures locally and then use
>> ExecuteFlumeSource with spooldir, which can read the picture as a whole,
>> but I lose the filename.
>>
>> -----Original Message-----
>> From: Bryan Bende [mailto:[email protected]]
>> Sent: March 23, 2018 0:42
>> To: [email protected]
>> Subject: Re: put pictures from remote server into hdfs
>>
>> Hello,
>>
>> It would probably be best to use GetSFTP -> PutHDFS.
>>
>> No need to write the files out to local disk somewhere else with PutFile,
>> they can go straight to HDFS.
>>
>> The filename in HDFS will be the "filename" attribute of the flow file,
>> which GetSFTP should be setting to the filename it picked up.
>>
>> If you need a different filename, you can stick an UpdateAttribute before
>> PutHDFS and change the filename attribute to whatever makes sense.
>>
>> -Bryan
>>
>> On Thu, Mar 22, 2018 at 12:18 PM, 李 磊 <[email protected]> wrote:
>> > Hi all,
>> >
>> > My requirement is to put pictures from a remote server (not in the
>> > nifi cluster) into hdfs.
>> >
>> > First I use GetSFTP and PutFile to get the pictures to local disk, and
>> > then use ExecuteFlumeSource and ExecuteFlumeSink to put the pictures
>> > into hdfs from local.
>> >
>> > However, there is a problem: the names of the pictures put into hdfs
>> > do not stay the same as the local names.
>> >
>> > Could you tell me how to keep the names the same, or a better way to
>> > put pictures into hdfs from a remote server with nifi?
>> >
>> > Thanks!
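
The size guidance earlier in this thread (pictures under about 1MB are probably fine as ordinary HBase cells, MOB is aimed at values up to around 10MB, and anything larger is better kept in a blob store such as S3, Swift, or Ceph) can be sketched as a small routing rule. This is only an illustration of that rule of thumb: the function name `choose_store`, the constants, and the tier labels are hypothetical, and the thresholds are the rough figures quoted above, not limits enforced by HBase or NiFi.

```python
# Rough size tiers taken from the thread; rules of thumb, not hard limits.
SMALL_CELL_MAX_BYTES = 1 * 1024 * 1024   # "< 1MB": ordinary HBase cell
MOB_MAX_BYTES = 10 * 1024 * 1024         # "around 10MB": HBase MOB range


def choose_store(size_bytes: int) -> str:
    """Return a hypothetical storage tier label for a picture of this size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes < SMALL_CELL_MAX_BYTES:
        return "hbase-cell"   # small enough for a regular column value
    if size_bytes <= MOB_MAX_BYTES:
        return "hbase-mob"    # medium object; assumes a MOB-enabled column family
    return "blob-store"       # too large for HBase; e.g. S3, Swift, or Ceph


if __name__ == "__main__":
    for size in (200_000, 5_000_000, 50_000_000):
        print(size, choose_store(size))
```

In a flow, the same decision could be made on the flow file's size before the picture reaches PutHBaseCell, so that oversized files are sent to a different destination instead of HBase.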
