[
https://issues.apache.org/jira/browse/SAMZA-300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062346#comment-14062346
]
Chris Riccomini commented on SAMZA-300:
---------------------------------------
bq. do we want to provide the out-of-box solution for consuming the metrics
stream
Yeah, my opinion is that we should do two things: 1) expose intra-job
(job-level) metrics in the YARN AM, and 2) expose inter-job (topology-level)
metrics in a standalone dashboard.
bq. do we want to provide more output formats for metrics, such as CSV, STDOUT,
Graphite
Yes, we should open some tickets for the relevant ones. All three you list are
useful. Ganglia also, probably.
bq. I'm curious what the reason was for not using Yammer Metrics at the
beginning. I see it's actually in our dependencies (from Kafka, I guess).
We intentionally excluded it. Coda's metrics library was breaking backwards
compatibility frequently, which was causing a ton of problems for us. We
essentially just copied their metrics 2.0 API, but kept it inside Samza's code
base. The philosophy was that metrics are so fundamental to the system that we
wanted complete control over them. We didn't apply this logic to logging because
log4j was (and still is) far more stable than any metrics library out there
(that I know of). Most of the Yammer metrics v2 plugins can be pretty trivially
adapted to our API.
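To give a rough idea of how small such an adapter can be, here is a minimal
Python sketch of a reporter that walks a metrics registry and writes one CSV
row per metric to STDOUT (or any text stream). Every name in it (Counter,
MetricsRegistry, CsvReporter, render_csv) is hypothetical and made up for
illustration; none of this is Samza's or Yammer's actual API.

```python
import csv
import io
import sys


class Counter:
    """Hypothetical counter metric (illustration only, not Samza's class)."""

    def __init__(self, name):
        self.name = name
        self.count = 0

    def inc(self, n=1):
        self.count += n


class MetricsRegistry:
    """Hypothetical registry mapping group name -> metric name -> counter."""

    def __init__(self):
        self.groups = {}

    def new_counter(self, group, name):
        counter = Counter(name)
        self.groups.setdefault(group, {})[name] = counter
        return counter


class CsvReporter:
    """Writes one CSV row per metric to a text stream (STDOUT by default)."""

    def __init__(self, out=None):
        self.out = out if out is not None else sys.stdout

    def report(self, source, registry):
        writer = csv.writer(self.out, lineterminator="\n")
        # Sort groups and names so the output order is deterministic.
        for group in sorted(registry.groups):
            for name in sorted(registry.groups[group]):
                count = registry.groups[group][name].count
                writer.writerow([source, group, name, count])


def render_csv(source, registry):
    """Convenience helper: return the CSV report as a string."""
    buf = io.StringIO()
    CsvReporter(buf).report(source, registry)
    return buf.getvalue()
```

A Graphite or Ganglia reporter would presumably have the same shape, with
only the body of report() changing to speak the relevant wire format.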
> Track producers and consumers of streams
> ----------------------------------------
>
> Key: SAMZA-300
> URL: https://issues.apache.org/jira/browse/SAMZA-300
> Project: Samza
> Issue Type: New Feature
> Reporter: Martin Kleppmann
>
> Each Samza job runs independently, which has a lot of advantages. However,
> there are situations in which it would be valuable to have a global overview
> of the data flows between jobs. For example:
> - It's important for correctness that only one job ever publishes to a given
> checkpoint or changelog stream — if several jobs publish to the same stream,
> the result is nonsensical. However, we currently have no way of enforcing
> that. It would be good if a job could take a "write lock" on a stream, and
> thus prevent others from writing to it.
> - It would be awesome to have a dashboard/visualization that graphically
> shows the job graph, and visually highlights the health of a job (e.g.
> whether a job has fallen behind).
> - The job graph would also be generally useful for tracking data provenance
> (finding consumers who would be affected by a schema change, finding the team
> that is responsible for producing a particular stream, etc.).
> - Potentially could include additional metadata about streams, e.g. owner,
> serialization format, schema, documentation of semantics of the data, etc.
> (HCatalog for streams?)
> One possibility would be for Kafka to add some of this functionality,
> although it may also make sense to implement it in Samza (that way it would
> be available for non-Kafka systems as well, and could use knowledge about the
> job that Samza has, but Kafka hasn't).
> This is just a vague description to start a discussion. Please comment with
> your ideas on how to best implement this.