I think there is no hard limit on the number of files one can write to at
the same time, because each write stream goes out to its own set of
DataNodes, which are most likely different machines. It is similar to
MapReduce output being stored directly as separate files in HDFS, where
there is likewise no hard limit on the number of files written concurrently.
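For illustration, here is a minimal sketch of what "many concurrent writers"
looks like with the standard org.apache.hadoop.fs Java API; the source names
and directory layout below are made up, and how many streams you keep open in
practice will depend on client memory and DataNode xceiver limits:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class MultiWriterSketch {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // One open stream per source; each stream's blocks are pipelined to
        // (mostly) different DataNodes, so the streams do not share a lock.
        Map<String, FSDataOutputStream> writers = new HashMap<String, FSDataOutputStream>();
        String[] sources = {"source-a", "source-b", "source-c"}; // hypothetical source names
        for (String source : sources) {
            Path path = new Path("/logs/" + source + "/current.log"); // hypothetical layout
            writers.put(source, fs.create(path));
        }

        // Append a line to each source's file and flush it to the DataNode pipeline.
        for (Map.Entry<String, FSDataOutputStream> e : writers.entrySet()) {
            e.getValue().writeBytes("log line for " + e.getKey() + "\n");
            e.getValue().hflush();
        }

        // Close all streams when done (or when rotating the files).
        for (FSDataOutputStream out : writers.values()) {
            out.close();
        }
        fs.close();
    }
}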

2012/8/7 Nguyen Manh Tien <tien.nguyenm...@gmail.com>

> @Yanbo, Alex: I want to develop a custom module to write directly to HDFS.
> The collector in Flume aggregates logs from many sources and writes them into a few files.
> So if I want to write to many files (for example, one for each source), I
> want to know how many files we can open in that case.
>
> Thanks.
> Tien
>
>
> On Mon, Aug 6, 2012 at 9:58 PM, Alex Baranau <alex.barano...@gmail.com>wrote:
>
>> Also interested in this question.
>>
>> @Yanbo: while we could use third-party tools to import/gather data into
>> HDFS, I guess the intention here is to write data to HDFS directly. It
>> would be great to hear what the "sensible" limitations are on the number
>> of files one can write to at the same time.
>>
>> Thank you in advance,
>>
>> Alex Baranau
>> ------
>> Sematext :: http://sematext.com/ :: Hadoop - HBase - ElasticSearch - Solr
>>
>> On Mon, Aug 6, 2012 at 2:14 AM, Yanbo Liang <yanboha...@gmail.com> wrote:
>>
>>> You can use Scribe or Flume to collect log data and integrate it with
>>> Hadoop.
>>>
>>>
>>> 2012/8/4 Nguyen Manh Tien <tien.nguyenm...@gmail.com>
>>>
>>>> Hi,
>>>> I plan to stream log data to HDFS using many writers, each writer writing
>>>> a stream of data to an HDFS file (which may rotate).
>>>>
>>>> I wonder how many concurrent writers I should use?
>>>> And if you have experience with this, please share it: Hadoop cluster
>>>> size, number of writers, replication factor.
>>>>
>>>> Thanks.
>>>> Tien
>>>>
>>>
>>>
>>
>>
>> --
>> Alex Baranau
>> ------
>> Sematext :: http://blog.sematext.com/ :: Hadoop - HBase - ElasticSearch
>> - Solr
>>
>>
>
