GitHub user mikewalch commented on a diff in the pull request:
--- Diff: docs/fluo/1.0.0-incubating/metrics.md ---
@@ -0,0 +1,117 @@
+title: Fluo Metrics
+A Fluo application can be configured (in [fluo.properties]) to report
metrics. When metrics are
+configured, Fluo will report some 'default' metrics about an application
that help users monitor its
+performance. Users can also write code to report 'application-specific'
metrics from their
+applications. Both 'application-specific' and 'default' metrics share the
same reporter configured
+by [fluo.properties] and are described in detail below.
+## Configuring reporters
+Fluo metrics are not published by default. To publish metrics, configure a
reporter in the 'metrics'
+section of [fluo.properties]. There are several different reporter types
(i.e., Console, CSV,
+Graphite, JMX, SLF4J) that are implemented using [Dropwizard]. The choice
of which reporter to use
+depends on the visualization tool used. If you are not currently using a
visualization tool, there
+is [documentation][grafana] for reporting Fluo metrics to Grafana/InfluxDB.
+## Metrics names
+When Fluo metrics are reported, they are published using a naming scheme
that encodes additional
+information. This additional information is represented using all-caps
variables (e.g. `METRIC`).
+Default metrics start with `fluo.class` or `fluo.system` and have the
following naming schemes:
+Application metrics start with `fluo.app` and have the following scheme:
+The variables below describe the additional information that is encoded in
+1. `APPLICATION` - Fluo application name
+2. `REPORTER_ID` - Unique ID of the Fluo oracle, worker, or client that is
reporting the metric.
+ When running in YARN, this ID is of the format `worker-INSTANCE_ID` or
+ where `INSTANCE_ID` corresponds to the instance number. When not running
in YARN, this ID consists
+ of a hostname and a base36 long that is unique across all fluo
+3. `METRIC` - Name of the metric. For 'default' metrics, this is set by
Fluo. For 'application'
+ metrics, this is set by the user. The name should be unique and should
not contain a period ('.').
+4. `CLASS` - Name of the Fluo observer or loader class that produced the
metric. This allows things like
+ transaction collisions to be tracked per class.
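The naming rules above can be illustrated with a small sketch. It is not part of Fluo: the helper names `system_metric_name` and `metric_component`, and the values `myapp` and `worker-0`, are hypothetical. It only assumes the `fluo.system.APPLICATION.REPORTER_ID.METRIC` format quoted in this document, and it shows why keeping periods out of `METRIC` matters when names are dot-separated.

```python
def system_metric_name(application: str, reporter_id: str, metric: str) -> str:
    """Compose a name in the fluo.system.APPLICATION.REPORTER_ID.METRIC format.

    Hypothetical helper for illustration; Fluo builds these names itself.
    """
    if not metric or "." in metric:
        # The docs ask that METRIC avoid the period character, since the
        # period is the separator between the name components.
        raise ValueError(f"metric must be non-empty and contain no '.': {metric!r}")
    return ".".join(("fluo", "system", application, reporter_id, metric))


def metric_component(full_name: str) -> str:
    """Recover METRIC, assuming it is the last dot-separated component."""
    return full_name.rsplit(".", 1)[-1]


# Example values are made up: an application called "myapp" and a YARN
# worker whose REPORTER_ID follows the worker-INSTANCE_ID pattern.
name = system_metric_name("myapp", "worker-0", "oracle_response_time")
print(name)
print(metric_component(name))
```

Because `METRIC` contains no period, the last component of a reported name can be split off unambiguously, even if other components happen to contain periods.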
+## Application-specific metrics
+Application metrics are implemented by retrieving a [MetricsReporter] from
an [Observer], [Loader],
+or [FluoClient]. These metrics are named using the format
+## Default metrics
+Default metrics report for a particular Observer/Loader class or
+Below are metrics that are reported from each Observer/Loader class that
is configured in a Fluo
+application. These metrics are reported after each transaction and named
using the format
+* tx_lock_wait_time - [Timer]
+ - Time transaction spent waiting on locks held by other transactions.
+ - Only updated for transactions that have non-zero lock time.
+* tx_execution_time - [Timer]
+ - Time transaction took to execute.
+ - Updated for failed and successful transactions.
+ - This does not include commit time, only the time from start until
commit is called.
+* tx_with_collision - [Meter]
+ - Rate of transactions with collisions.
+* tx_collisions - [Meter]
+ - Rate of collisions.
+* tx_entries_set - [Meter]
+ - Rate of row/columns set by the transaction.
+* tx_entries_read - [Meter]
+ - Rate of row/columns read by transaction that existed.
+ - There is currently no count of all reads (including non-existent
+* tx_locks_timedout - [Meter]
+ - Rate of timed-out locks rolled back by the transaction.
+ - These are locks that are held for very long periods by another
transaction that appears to be
+ alive based on ZooKeeper.
+* tx_locks_dead - [Meter]
+ - Rate of dead locks rolled back by a transaction.
+ - These are locks held by a process that appears to be dead according
+* tx_status_`<STATUS>` - [Meter]
+ - Rate of different ways (i.e. `<STATUS>`) a transaction can terminate
+Below are system-wide metrics that are reported for the entire Fluo
application. These metrics are
+named using the format `fluo.system.APPLICATION.REPORTER_ID.METRIC`.
+* oracle_response_time - [Timer]
+ - Time taken by each RPC call to the oracle for stamps.
+* oracle_client_stamps - [Histogram]
+ - Number of stamps requested for each request for stamps to the server
+* oracle_server_stamps - [Histogram]
+ - Number of stamps requested for each request for stamps from a client
+* worker_notifications_queued - [Gauge]
+ - The current number of notifications queued for processing.
+* transactor_committing - [Gauge]
+ - The current number of transactions that are working their way
through the commit steps.
+Histograms and Timers have a counter. In the case of a histogram, the
counter is the number of times
+the metric was updated and not a sum of the updates. For example, if a
request for 5 timestamps was
+made to the oracle followed by a request for 3 timestamps, then the count
+would be 2 and the mean would be (5+3)/2.
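The count semantics described above can be checked with a minimal sketch. The `CountingHistogram` class is hypothetical and written only for this example; it is not the Dropwizard implementation, which also tracks sampled distributions. It records each update and reports the number of updates as the count.

```python
class CountingHistogram:
    """Toy histogram: count is the number of updates, not their sum."""

    def __init__(self) -> None:
        self._values: list[int] = []

    def update(self, value: int) -> None:
        # Each call is one observation, regardless of the value's size.
        self._values.append(value)

    @property
    def count(self) -> int:
        return len(self._values)

    @property
    def mean(self) -> float:
        return sum(self._values) / len(self._values)


# Two requests to the oracle: one for 5 timestamps, one for 3.
stamps = CountingHistogram()
stamps.update(5)
stamps.update(3)
print(stamps.count)  # 2 updates, not 8 stamps
print(stamps.mean)   # (5 + 3) / 2 = 4.0
```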
--- End diff --