[
https://issues.apache.org/jira/browse/HTRACE-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14244767#comment-14244767
]
Colin Patrick McCabe commented on HTRACE-18:
--------------------------------------------
Hi [~longzhouwk], this is really interesting.
Sorry if this is a dumb question, but what does Flume do with the spans once it
gets them? Does it write them to a file? That wasn't clear to me from reading
the code (perhaps I missed it, or perhaps it's implicit in how Flume works). I
guess this could be useful in the case where your cluster is generating a short
burst of tracing activity and you want to use Flume to avoid losing any trace
spans.
One potential problem that I see is that you will need to load the trace spans
into some indexed database or storage system in order to do queries on them.
For example, to support a web UI, we need some way to do operations like "look
up the trace span which is the parent of the current span." You can't do that
efficiently on a flat file that Flume has dumped to disk, because finding one
span means scanning the whole file. This is why we are working on the
htrace standalone server (which stores trace spans in a LevelDB instance) and
the htrace hbase module, which sends trace spans to HBase. (HBase supports
looking up spans by index.) However, maybe we could have some way of loading
the trace spans that Flume wrote into HBase or the standalone server later on.
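To make the "look up the parent" operation concrete, here is a rough sketch of
what an index by span ID buys us. It is only an illustration: {{SimpleSpan}} and
{{SpanIndexSketch}} are made-up names, not htrace classes, and an in-memory
HashMap stands in for LevelDB or HBase.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: an in-memory index of spans keyed by span ID.
// A real deployment would use an indexed store, not a HashMap.
class SpanIndexSketch {
    // Simplified span record for illustration; not the htrace-core Span interface.
    static final class SimpleSpan {
        final long spanId;
        final long parentId; // 0 means "no parent" in this sketch
        final String description;

        SimpleSpan(long spanId, long parentId, String description) {
            this.spanId = spanId;
            this.parentId = parentId;
            this.description = description;
        }
    }

    private final Map<Long, SimpleSpan> byId = new HashMap<>();

    void add(SimpleSpan span) {
        byId.put(span.spanId, span);
    }

    // Two keyed lookups, with no scan over all stored spans.
    // Returns null if the span is unknown, has no parent, or the parent is missing.
    SimpleSpan parentOf(long spanId) {
        SimpleSpan span = byId.get(spanId);
        if (span == null || span.parentId == 0) {
            return null;
        }
        return byId.get(span.parentId);
    }

    public static void main(String[] args) {
        SpanIndexSketch index = new SpanIndexSketch();
        index.add(new SimpleSpan(1L, 0L, "root"));
        index.add(new SimpleSpan(2L, 1L, "child"));
        System.out.println(index.parentOf(2L).description);
    }
}
```

With a flat file, the same {{parentOf}} call would have to read spans one by one
until it found a matching ID.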
I think we should use JSON to serialize these trace spans, rather than Avro.
JSON serialization is already implemented in htrace-core... we just need to
move the JSON converter from {{LocalFileSpanReceiver.java}} into {{Span.java}}
and make it public. This avoids all the Avro boilerplate, and will make it
easier to load these trace spans into HBase or htraced later. Plus it will make
them human-readable. (We could also consider protobuf down the road, but I
think JSON is easier for now.)
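As a rough idea of what one-JSON-object-per-span could look like, here is a
minimal hand-rolled sketch with no external dependencies. This is not the
existing converter in {{LocalFileSpanReceiver.java}}; the class name
{{SpanJsonSketch}} and the field names ("traceId", "spanId", and so on) are
assumptions made up for illustration.

```java
// Hypothetical sketch: serialize a span to a single-line JSON object.
// Field names and layout are assumptions, not the htrace-core format.
class SpanJsonSketch {
    // Escape the characters that JSON requires to be escaped inside strings.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Build one JSON object per span; newlines in the description are escaped,
    // so each span stays on a single line.
    static String toJson(long traceId, long spanId, long parentId,
                         long startMs, long stopMs, String description) {
        return "{\"traceId\":" + traceId
            + ",\"spanId\":" + spanId
            + ",\"parentId\":" + parentId
            + ",\"start\":" + startMs
            + ",\"stop\":" + stopMs
            + ",\"description\":\"" + escape(description) + "\"}";
    }

    public static void main(String[] args) {
        System.out.println(toJson(1L, 2L, 1L, 100L, 150L, "get \"row\""));
    }
}
```

One span per line keeps the output greppable and makes a later bulk load into
HBase or htraced a matter of reading the file line by line.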
I would also argue that we shouldn't use Guava in {{htrace-flume}}. The reason
is that Flume pulls in its own version of Guava, which may conflict with our
version.
> Support flume receiver
> ----------------------
>
> Key: HTRACE-18
> URL: https://issues.apache.org/jira/browse/HTRACE-18
> Project: HTrace
> Issue Type: Improvement
> Reporter: Long Zhou
> Attachments: htrace-flume01.patch
>
>
> Hi htrace devs,
> I have been using htrace for a while and find it very useful.
> I needed a way to collect traces from remote servers via flume, so I
> implemented the flume receiver (patch attached). If this code is useful to
> other users, I would like to contribute it to the project.
> Please kindly review the patch, and let me know if there is anything I
> should fix/improve.
> Thanks,
> Long Zhou