[
https://issues.apache.org/jira/browse/CHUKWA-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764748#action_12764748
]
Eric Yang commented on CHUKWA-22:
---------------------------------
Chukwa's demux processor already orders the data, so about 95% of the time
the writes to HBase should be sequential. My test machines also have 16GB of
RAM, so I am not seeing the memory and throughput problems yet. Maybe my
dataset is too small when writing to HBase. The paper is an interesting read.
Thanks for sharing. I am open to suggestions on indexing Chukwa data. Perhaps
the data could be managed using TFiles, but that would make Chukwa repeat a
lot of the work that HBase already does, which I would like to avoid.
Something to think about.
> Need index for chukwa sequence files
> ------------------------------------
>
> Key: CHUKWA-22
> URL: https://issues.apache.org/jira/browse/CHUKWA-22
> Project: Hadoop Chukwa
> Issue Type: New Feature
> Components: Data Processors
> Environment: Redhat EL 5.1 and Java 6
> Reporter: Eric Yang
> Assignee: Eric Yang
>
> Chukwa has the ability to collect large volumes of data, but the lack of an
> index prevents the Chukwa front end from serving data straight from HDFS.
> This JIRA is the placeholder for designing an indexing service for Chukwa.
> The plan is to create an indexing service based on available software like
> Lucene or Katta.