http://git-wip-us.apache.org/repos/asf/metron/blob/cf4c2ecd/current-book/metron-platform/metron-enrichment/index.html ---------------------------------------------------------------------- diff --git a/current-book/metron-platform/metron-enrichment/index.html b/current-book/metron-platform/metron-enrichment/index.html index 33615dd..771f646 100644 --- a/current-book/metron-platform/metron-enrichment/index.html +++ b/current-book/metron-platform/metron-enrichment/index.html @@ -1,13 +1,13 @@ <!DOCTYPE html> <!-- - | Generated by Apache Maven Doxia at 2017-02-23 + | Generated by Apache Maven Doxia at 2017-06-27 | Rendered using Apache Maven Fluido Skin 1.3.0 --> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta charset="UTF-8" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> - <meta name="Date-Revision-yyyymmdd" content="20170223" /> + <meta name="Date-Revision-yyyymmdd" content="20170627" /> <meta http-equiv="Content-Language" content="en" /> <title>Metron – Enrichment</title> <link rel="stylesheet" href="../../css/apache-maven-fluido-1.3.0.min.css" /> @@ -30,14 +30,11 @@ <div class="container-fluid"> <div id="banner"> <div class="pull-left"> - <a href="http://metron.incubator.apache.org/" id="bannerLeft"> - <img src="../../images/metron-logo.png" alt="Apache Metron - Incubating" width="148px" height="48px"/> + <a href="http://metron.apache.org/" id="bannerLeft"> + <img src="../../images/metron-logo.png" alt="Apache Metron" width="148px" height="48px"/> </a> </div> - <div class="pull-right"> <a href="http://incubator.apache.org/" id="bannerRight"> - <img src="../../images/ApacheIncubating_Logo.png" alt="Apache Incubating" width="192px" height="48px"/> - </a> - </div> + <div class="pull-right"> </div> <div class="clear"><hr/></div> </div> @@ -51,8 +48,8 @@ </li> <li class="divider ">/</li> <li class=""> - <a href="http://metron.incubator.apache.org/" class="externalLink" title="Metron-Incubating"> - 
Metron-Incubating</a> + <a href="http://metron.apache.org/" class="externalLink" title="Metron"> + Metron</a> </li> <li class="divider ">/</li> <li class=""> @@ -64,8 +61,8 @@ - <li id="publishDate" class="pull-right">Last Published: 2017-02-23</li> <li class="divider pull-right">|</li> - <li id="projectVersion" class="pull-right">Version: 0.3.1</li> + <li id="publishDate" class="pull-right">Last Published: 2017-06-27</li> <li class="divider pull-right">|</li> + <li id="projectVersion" class="pull-right">Version: 0.4.0</li> </ul> </div> @@ -78,7 +75,7 @@ <ul class="nav nav-list"> <li class="nav-header">User Documentation</li> - + <li> <a href="../../index.html" title="Metron"> @@ -99,7 +96,7 @@ <i class="icon-chevron-right"></i> Analytics</a> </li> - + <li> <a href="../../metron-deployment/index.html" title="Deployment"> @@ -113,7 +110,21 @@ <i class="none"></i> Docker</a> </li> - + + <li> + + <a href="../../metron-interface/metron-config/index.html" title="Config"> + <i class="none"></i> + Config</a> + </li> + + <li> + + <a href="../../metron-interface/metron-rest/index.html" title="Rest"> + <i class="none"></i> + Rest</a> + </li> + <li> <a href="../../metron-platform/index.html" title="Platform"> @@ -127,13 +138,13 @@ <i class="none"></i> Api</a> </li> - + <li> <a href="../../metron-platform/metron-common/index.html" title="Common"> - <i class="none"></i> + <i class="icon-chevron-right"></i> Common</a> - </li> + </li> <li> @@ -174,9 +185,16 @@ <i class="none"></i> Pcap-backend</a> </li> + + <li> + + <a href="../../metron-platform/metron-writer/index.html" title="Writer"> + <i class="none"></i> + Writer</a> + </li> </ul> </li> - + <li> <a href="../../metron-sensors/index.html" title="Sensors"> @@ -286,7 +304,7 @@ </tbody> </table> <p>The <tt>config</tt> map is intended to house enrichment specific configuration. 
For instance, for the <tt>hbaseEnrichment</tt>, the mappings between the enrichment types to the column families is specified.</p> -<p>The <tt>fieldMap</tt>contents are of interest because they contain the routing and configuration information for the enrichments. When we say ‘routing’, we mean how the messages get split up and sent to the enrichment adapter bolts. The simplest, by far, is just providing a simple list as in</p> +<p>The <tt>fieldMap</tt>contents are of interest because they contain the routing and configuration information for the enrichments.<br />When we say ‘routing’, we mean how the messages get split up and sent to the enrichment adapter bolts.<br />The simplest, by far, is just providing a simple list as in</p> <div class="source"> <div class="source"> @@ -305,7 +323,11 @@ ] } </pre></div></div> -<p>Based on this sample config, both ip_src_addr and ip_dst_addr will go to the <tt>geo</tt>, <tt>host</tt>, and <tt>hbaseEnrichment</tt> adapter bolts. For the <tt>geo</tt>, <tt>host</tt> and <tt>hbaseEnrichment</tt>, this is sufficient. However, more complex enrichments may contain their own configuration. Currently, the <tt>stellar</tt> enrichment requires a more complex configuration, such as:</p> +<p>Based on this sample config, both <tt>ip_src_addr</tt> and <tt>ip_dst_addr</tt> will go to the <tt>geo</tt>, <tt>host</tt>, and <tt>hbaseEnrichment</tt> adapter bolts. </p> +<div class="section"> +<h4><a name="Stellar_Enrichment_Configuration"></a>Stellar Enrichment Configuration</h4> +<p>For the <tt>geo</tt>, <tt>host</tt> and <tt>hbaseEnrichment</tt>, this is sufficient. However, more complex enrichments may contain their own configuration. Currently, the <tt>stellar</tt> enrichment is more adaptable and thus requires a more nuanced configuration.</p> +<p>At its most basic, we want to take a message and apply a couple of enrichments, such as converting the <tt>hostname</tt> field to lowercase. 
We do this by specifying the transformation inside of the <tt>config</tt> for the <tt>stellar</tt> fieldMap. There are two syntaxes that are supported, specifying the transformations as a map with the key as the field and the value the stellar expression:</p> <div class="source"> <div class="source"> @@ -313,34 +335,78 @@ ... "stellar" : { "config" : { - "numeric" : { - "foo": "1 + 1" - } - ,"ALL_CAPS" : "TO_UPPER(source.type)" + "hostname" : "TO_LOWER(hostname)" } } } </pre></div></div> -<p>Whereas the simpler enrichments just need a set of fields explicitly stated so they can be separated from the message and sent to the enrichment adapter bolt for enrichment and ultimately joined back in the join bolt, the stellar enrichment has its set of required fields implicitly stated through usage. For instance, if your stellar statement references a field, it should be included and if not, then it should not be included. We did not want to require users to make explicit the implicit.</p> -<p>The other way in which the stellar enrichment is somewhat more complex is in how the statements are executed. In the general purpose case for a list of fields, those fields are used to create a message to send to the enrichment adapter bolt and that bolt’s worker will handle the fields one by one in serial for a given message. For stellar enrichment, we wanted to have a more complex design so that users could specify the groups of stellar statements sent to the same worker in the same message (and thus executed sequentially). Consider the following configuration:</p> +<p>Another approach is to make the transformations as a list with the same <tt>var := expr</tt> syntax as is used in the Stellar REPL, such as:</p> + +<div class="source"> +<div class="source"> +<pre> "fieldMap": { + ... 
+ "stellar" : { + "config" : [ + "hostname := TO_LOWER(hostname)" + ] + } + } +</pre></div></div> +<p>Sometimes arbitrary stellar enrichments may take enough time that you would prefer to split some of them into groups and execute the groups of stellar enrichments in parallel. Take, for instance, if you wanted to do an HBase enrichment and a profiler call which were independent of one another. This usecase is supported by splitting the enrichments up as groups.</p> +<p>Consider the following example:</p> <div class="source"> <div class="source"> <pre> "fieldMap": { + ... "stellar" : { "config" : { - "numeric" : { - "foo": "1 + 1" - "bar" : TO_LOWER(source.type)" - } - ,"text" : { - "ALL_CAPS" : "TO_UPPER(source.type)" - } + "malicious_domain_enrichment" : { + "is_bad_domain" : "ENRICHMENT_EXISTS('malicious_domains', ip_dst_addr, 'enrichments', 'cf')" + }, + "login_profile" : [ + "profile_window := PROFILE_WINDOW('from 6 months ago')", + "global_login_profile := PROFILE_GET('distinct_login_attempts', 'global', profile_window)", + "stats := STATS_MERGE(global_login_profile)", + "auth_attempts_median := STATS_PERCENTILE(stats, 0.5)", + "auth_attempts_sd := STATS_SD(stats)", + "profile_window := null", + "global_login_profile := null", + "stats := null" + ] } } } </pre></div></div> -<p>We have a group called <tt>numeric</tt> whose stellar statements will be executed sequentially. In parallel to that, we have the group of stellar statements under the group <tt>text</tt> executing. The intent here is to allow you to not force higher latency operations to be done sequentially. You can use any name for your groupings you like. Be aware that the configuration is a map and duplicate configuration keys’ values are not combined, so the duplicate configuration value will be overwritten.</p></div> +<p>Here we want to perform two enrichments that hit HBase and we would rather not run in sequence. These enrichments are entirely independent of one another (i.e. 
neither relies on the output of the other). In this case, we’ve created a group called <tt>malicious_domain_enrichment</tt> to inquire about whether the destination address exists in the HBase enrichment table in the <tt>malicious_domains</tt> enrichment type. This is a simple enrichment, so we can express the enrichment group as a map with the new field <tt>is_bad_domain</tt> being a key and the stellar expression associated with that operation being the associated value.</p> +<p>In contrast, the stellar enrichment group <tt>login_profile</tt> interacts with the profiler and has multiple temporary expressions (i.e. <tt>profile_window</tt>, <tt>global_login_profile</tt>, and <tt>stats</tt>) that are useful only within the context of this group of stellar expressions. In this case, we would need to ensure that we use the list construct when specifying the group and remember to set the temporary variables to <tt>null</tt> so they are not passed along.</p> +<p>In general, things to note from this section are as follows:</p> + +<ul> + +<li>The stellar enrichments for the <tt>stellar</tt> enrichment adapter are specified in the <tt>config</tt> for the <tt>stellar</tt> enrichment adapter in the <tt>fieldMap</tt></li> + +<li>Groups of independent expressions (i.e. no expression in any group depends on the output of an expression from another group) may be executed in parallel</li> + +<li>If you have the need to use temporary variables, you may use the list construct. 
Ensure that you assign the variables to <tt>null</tt> before the end of the group.</li> + +<li><b>Ensure that you do not assign a field to a stellar expression which returns an object which JSON cannot represent.</b></li> + +<li>Fields assigned to Maps as part of stellar enrichments have the maps unfolded, similar to the HBase Enrichment + +<ul> + +<li>For example the stellar enrichment for field <tt>foo</tt> which assigns a map such as <tt>foo := { 'grok' : 1, 'bar' : 'baz'}</tt> would yield the following fields: + +<ul> + +<li><tt>foo.grok</tt> == <tt>1</tt></li> + +<li><tt>foo.bar</tt> == <tt>'baz'</tt></li> + </ul></li> + </ul></li> +</ul></div></div> <div class="section"> <h3><a name="The_threatIntel_Configuration"></a>The <tt>threatIntel</tt> Configuration</h3> @@ -395,7 +461,7 @@ </tr> </tbody> </table> -<p>The <tt>config</tt> map is intended to house threat intel specific configuration. For instance, for the <tt>hbaseThreatIntel</tt> threat intel adapter, the mappings between the enrichment types to the column families is specified.</p> +<p>The <tt>config</tt> map is intended to house threat intel specific configuration. For instance, for the <tt>hbaseThreatIntel</tt> threat intel adapter, the mappings between the enrichment types to the column families is specified. The <tt>fieldMap</tt> configuration is similar to the <tt>enrichment</tt> configuration in that the adapters available are the same.</p> <p>The <tt>triageConfig</tt> field is also a complex field and it bears some description:</p> <table border="0" class="table table-striped"> @@ -442,6 +508,8 @@ <li><tt>rule</tt> : The rule, represented as a Stellar statement</li> <li><tt>score</tt> : Associated threat triage score for the rule</li> + +<li><tt>reason</tt> : Reason the rule tripped. Can be represented as a Stellar statement</li> </ul> <p>An example of a rule is as follows:</p> @@ -451,8 +519,9 @@ { "name" : "is internal" , "comment" : "determines if the destination is internal." 
- , rule" : "IN_SUBNET(ip_dst_addr, '192.168.0.0/24')" - , "score" : 10 + , "rule" : "IN_SUBNET(ip_dst_addr, '192.168.0.0/24')" + , "score" : 10 + , "reason" : "FORMAT('%s is internal', ip_dst_addr)" } ] </pre></div></div> @@ -657,8 +726,9 @@ <footer> <div class="container-fluid"> - <div class="row span12">Copyright © 2017. - All Rights Reserved. + <div class="row span12">Copyright © 2017 + <a href="https://www.apache.org">The Apache Software Foundation</a>. + All Rights Reserved. </div>
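The map unfolding noted in the Stellar enrichment section above (a field `foo` assigned `{ 'grok' : 1, 'bar' : 'baz'}` yields `foo.grok` and `foo.bar`) can be illustrated with a minimal Python sketch. This is not Metron code: the helper name `unfold_enrichment` is hypothetical, and flattening only one level of nesting is an assumption, since the documentation does not say how nested maps are handled.

```python
def unfold_enrichment(field, value):
    """Flatten a map-valued enrichment result into dotted field names.

    Hypothetical helper for illustration only. Assumes a single level of
    unfolding; non-map values are kept under the original field name.
    """
    if isinstance(value, dict):
        return {f"{field}.{key}": val for key, val in value.items()}
    return {field: value}


# Example message with one existing field (values are made up).
message = {"ip_dst_addr": "10.0.0.1"}

# Equivalent of the documented assignment: foo := { 'grok' : 1, 'bar' : 'baz'}
message.update(unfold_enrichment("foo", {"grok": 1, "bar": "baz"}))

print(message)
```

With this sketch the message ends up carrying `foo.grok` and `foo.bar` as separate top-level fields rather than a single `foo` field holding a map, which matches the behaviour the bullet list describes.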
http://git-wip-us.apache.org/repos/asf/metron/blob/cf4c2ecd/current-book/metron-platform/metron-indexing/index.html ---------------------------------------------------------------------- diff --git a/current-book/metron-platform/metron-indexing/index.html b/current-book/metron-platform/metron-indexing/index.html index 1f5d0cf..febd70e 100644 --- a/current-book/metron-platform/metron-indexing/index.html +++ b/current-book/metron-platform/metron-indexing/index.html @@ -1,13 +1,13 @@ <!DOCTYPE html> <!-- - | Generated by Apache Maven Doxia at 2017-02-23 + | Generated by Apache Maven Doxia at 2017-06-27 | Rendered using Apache Maven Fluido Skin 1.3.0 --> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta charset="UTF-8" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> - <meta name="Date-Revision-yyyymmdd" content="20170223" /> + <meta name="Date-Revision-yyyymmdd" content="20170627" /> <meta http-equiv="Content-Language" content="en" /> <title>Metron – Indexing</title> <link rel="stylesheet" href="../../css/apache-maven-fluido-1.3.0.min.css" /> @@ -30,14 +30,11 @@ <div class="container-fluid"> <div id="banner"> <div class="pull-left"> - <a href="http://metron.incubator.apache.org/" id="bannerLeft"> - <img src="../../images/metron-logo.png" alt="Apache Metron - Incubating" width="148px" height="48px"/> + <a href="http://metron.apache.org/" id="bannerLeft"> + <img src="../../images/metron-logo.png" alt="Apache Metron" width="148px" height="48px"/> </a> </div> - <div class="pull-right"> <a href="http://incubator.apache.org/" id="bannerRight"> - <img src="../../images/ApacheIncubating_Logo.png" alt="Apache Incubating" width="192px" height="48px"/> - </a> - </div> + <div class="pull-right"> </div> <div class="clear"><hr/></div> </div> @@ -51,8 +48,8 @@ </li> <li class="divider ">/</li> <li class=""> - <a href="http://metron.incubator.apache.org/" class="externalLink" title="Metron-Incubating"> - Metron-Incubating</a> 
+ <a href="http://metron.apache.org/" class="externalLink" title="Metron"> + Metron</a> </li> <li class="divider ">/</li> <li class=""> @@ -64,8 +61,8 @@ - <li id="publishDate" class="pull-right">Last Published: 2017-02-23</li> <li class="divider pull-right">|</li> - <li id="projectVersion" class="pull-right">Version: 0.3.1</li> + <li id="publishDate" class="pull-right">Last Published: 2017-06-27</li> <li class="divider pull-right">|</li> + <li id="projectVersion" class="pull-right">Version: 0.4.0</li> </ul> </div> @@ -78,7 +75,7 @@ <ul class="nav nav-list"> <li class="nav-header">User Documentation</li> - + <li> <a href="../../index.html" title="Metron"> @@ -99,7 +96,7 @@ <i class="icon-chevron-right"></i> Analytics</a> </li> - + <li> <a href="../../metron-deployment/index.html" title="Deployment"> @@ -113,7 +110,21 @@ <i class="none"></i> Docker</a> </li> - + + <li> + + <a href="../../metron-interface/metron-config/index.html" title="Config"> + <i class="none"></i> + Config</a> + </li> + + <li> + + <a href="../../metron-interface/metron-rest/index.html" title="Rest"> + <i class="none"></i> + Rest</a> + </li> + <li> <a href="../../metron-platform/index.html" title="Platform"> @@ -127,13 +138,13 @@ <i class="none"></i> Api</a> </li> - + <li> <a href="../../metron-platform/metron-common/index.html" title="Common"> - <i class="none"></i> + <i class="icon-chevron-right"></i> Common</a> - </li> + </li> <li> @@ -174,9 +185,16 @@ <i class="none"></i> Pcap-backend</a> </li> + + <li> + + <a href="../../metron-platform/metron-writer/index.html" title="Writer"> + <i class="none"></i> + Writer</a> + </li> </ul> </li> - + <li> <a href="../../metron-sensors/index.html" title="Sensors"> @@ -232,7 +250,7 @@ <li>An indexing bolt configured to write to HDFS under <tt>/apps/metron/enrichment/indexed</tt></li> </ul> -<p>Errors during indexing are sent to a kafka queue called <tt>index_errors</tt></p></div> +<p>By default, errors during indexing are sent back into the 
<tt>indexing</tt> kafka queue so that they can be indexed and archived.</p></div> <div class="section"> <h2><a name="Sensor_Indexing_Configuration"></a>Sensor Indexing Configuration</h2> <p>The sensor specific configuration is intended to configure the indexing used for a given sensor type (e.g. <tt>snort</tt>). </p> @@ -384,7 +402,7 @@ <p>The <tt>indexing</tt> kafka queue is a collection point from the enrichment topology. As such, make sure that the number of partitions in the kafka topic is sufficient to handle the throughput that you expect.</p></div> <div class="section"> <h2><a name="Indexing_Topology"></a>Indexing Topology</h2> -<p>The enrichment topology as started by the <tt>$METRON_HOME/bin/start_elasticsearch_topology.sh</tt> or <tt>$METRON_HOME/bin/start_solr_topology.sh</tt> script uses a default of one executor per bolt. In a real production system, this should be customized by modifying the flux file in <tt>$METRON_HOME/flux/indexing/remote.yaml</tt>. </p> +<p>The <tt>indexing</tt> topology as started by the <tt>$METRON_HOME/bin/start_elasticsearch_topology.sh</tt> or <tt>$METRON_HOME/bin/start_solr_topology.sh</tt> script uses a default of one executor per bolt. In a real production system, this should be customized by modifying the flux file in <tt>$METRON_HOME/flux/indexing/remote.yaml</tt>. </p> <ul> @@ -415,8 +433,9 @@ <footer> <div class="container-fluid"> - <div class="row span12">Copyright © 2017. - All Rights Reserved. + <div class="row span12">Copyright © 2017 + <a href="https://www.apache.org">The Apache Software Foundation</a>. + All Rights Reserved. 
</div> http://git-wip-us.apache.org/repos/asf/metron/blob/cf4c2ecd/current-book/metron-platform/metron-management/index.html ---------------------------------------------------------------------- diff --git a/current-book/metron-platform/metron-management/index.html b/current-book/metron-platform/metron-management/index.html index c8d1d13..6efed91 100644 --- a/current-book/metron-platform/metron-management/index.html +++ b/current-book/metron-platform/metron-management/index.html @@ -1,13 +1,13 @@ <!DOCTYPE html> <!-- - | Generated by Apache Maven Doxia at 2017-02-23 + | Generated by Apache Maven Doxia at 2017-06-27 | Rendered using Apache Maven Fluido Skin 1.3.0 --> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta charset="UTF-8" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> - <meta name="Date-Revision-yyyymmdd" content="20170223" /> + <meta name="Date-Revision-yyyymmdd" content="20170627" /> <meta http-equiv="Content-Language" content="en" /> <title>Metron – Stellar REPL Management Utilities</title> <link rel="stylesheet" href="../../css/apache-maven-fluido-1.3.0.min.css" /> @@ -30,14 +30,11 @@ <div class="container-fluid"> <div id="banner"> <div class="pull-left"> - <a href="http://metron.incubator.apache.org/" id="bannerLeft"> - <img src="../../images/metron-logo.png" alt="Apache Metron - Incubating" width="148px" height="48px"/> + <a href="http://metron.apache.org/" id="bannerLeft"> + <img src="../../images/metron-logo.png" alt="Apache Metron" width="148px" height="48px"/> </a> </div> - <div class="pull-right"> <a href="http://incubator.apache.org/" id="bannerRight"> - <img src="../../images/ApacheIncubating_Logo.png" alt="Apache Incubating" width="192px" height="48px"/> - </a> - </div> + <div class="pull-right"> </div> <div class="clear"><hr/></div> </div> @@ -51,8 +48,8 @@ </li> <li class="divider ">/</li> <li class=""> - <a href="http://metron.incubator.apache.org/" class="externalLink" 
title="Metron-Incubating"> - Metron-Incubating</a> + <a href="http://metron.apache.org/" class="externalLink" title="Metron"> + Metron</a> </li> <li class="divider ">/</li> <li class=""> @@ -64,8 +61,8 @@ - <li id="publishDate" class="pull-right">Last Published: 2017-02-23</li> <li class="divider pull-right">|</li> - <li id="projectVersion" class="pull-right">Version: 0.3.1</li> + <li id="publishDate" class="pull-right">Last Published: 2017-06-27</li> <li class="divider pull-right">|</li> + <li id="projectVersion" class="pull-right">Version: 0.4.0</li> </ul> </div> @@ -78,7 +75,7 @@ <ul class="nav nav-list"> <li class="nav-header">User Documentation</li> - + <li> <a href="../../index.html" title="Metron"> @@ -99,7 +96,7 @@ <i class="icon-chevron-right"></i> Analytics</a> </li> - + <li> <a href="../../metron-deployment/index.html" title="Deployment"> @@ -113,7 +110,21 @@ <i class="none"></i> Docker</a> </li> - + + <li> + + <a href="../../metron-interface/metron-config/index.html" title="Config"> + <i class="none"></i> + Config</a> + </li> + + <li> + + <a href="../../metron-interface/metron-rest/index.html" title="Rest"> + <i class="none"></i> + Rest</a> + </li> + <li> <a href="../../metron-platform/index.html" title="Platform"> @@ -127,13 +138,13 @@ <i class="none"></i> Api</a> </li> - + <li> <a href="../../metron-platform/metron-common/index.html" title="Common"> - <i class="none"></i> + <i class="icon-chevron-right"></i> Common</a> - </li> + </li> <li> @@ -174,9 +185,16 @@ <i class="none"></i> Pcap-backend</a> </li> + + <li> + + <a href="../../metron-platform/metron-writer/index.html" title="Writer"> + <i class="none"></i> + Writer</a> + </li> </ul> </li> - + <li> <a href="../../metron-sensors/index.html" title="Sensors"> @@ -674,7 +692,7 @@ <li>writer - The writer to update (e.g. 
elasticsearch, solr or hdfs)</li> -<li>size - batch size (integer)</li> +<li>size - batch size (integer), defaults to 1, meaning batching disabled</li> </ul></li> <li>Returns: The String representation of the config in zookeeper</li> @@ -1589,8 +1607,9 @@ SION('is_both') ] ) <footer> <div class="container-fluid"> - <div class="row span12">Copyright © 2017. - All Rights Reserved. + <div class="row span12">Copyright © 2017 + <a href="https://www.apache.org">The Apache Software Foundation</a>. + All Rights Reserved. </div> http://git-wip-us.apache.org/repos/asf/metron/blob/cf4c2ecd/current-book/metron-platform/metron-parsers/index.html ---------------------------------------------------------------------- diff --git a/current-book/metron-platform/metron-parsers/index.html b/current-book/metron-platform/metron-parsers/index.html index 10bbd15..f7d13a6 100644 --- a/current-book/metron-platform/metron-parsers/index.html +++ b/current-book/metron-platform/metron-parsers/index.html @@ -1,13 +1,13 @@ <!DOCTYPE html> <!-- - | Generated by Apache Maven Doxia at 2017-02-23 + | Generated by Apache Maven Doxia at 2017-06-27 | Rendered using Apache Maven Fluido Skin 1.3.0 --> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta charset="UTF-8" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> - <meta name="Date-Revision-yyyymmdd" content="20170223" /> + <meta name="Date-Revision-yyyymmdd" content="20170627" /> <meta http-equiv="Content-Language" content="en" /> <title>Metron – Parsers</title> <link rel="stylesheet" href="../../css/apache-maven-fluido-1.3.0.min.css" /> @@ -30,14 +30,11 @@ <div class="container-fluid"> <div id="banner"> <div class="pull-left"> - <a href="http://metron.incubator.apache.org/" id="bannerLeft"> - <img src="../../images/metron-logo.png" alt="Apache Metron - Incubating" width="148px" height="48px"/> + <a href="http://metron.apache.org/" id="bannerLeft"> + <img src="../../images/metron-logo.png" 
alt="Apache Metron" width="148px" height="48px"/> </a> </div> - <div class="pull-right"> <a href="http://incubator.apache.org/" id="bannerRight"> - <img src="../../images/ApacheIncubating_Logo.png" alt="Apache Incubating" width="192px" height="48px"/> - </a> - </div> + <div class="pull-right"> </div> <div class="clear"><hr/></div> </div> @@ -51,8 +48,8 @@ </li> <li class="divider ">/</li> <li class=""> - <a href="http://metron.incubator.apache.org/" class="externalLink" title="Metron-Incubating"> - Metron-Incubating</a> + <a href="http://metron.apache.org/" class="externalLink" title="Metron"> + Metron</a> </li> <li class="divider ">/</li> <li class=""> @@ -64,8 +61,8 @@ - <li id="publishDate" class="pull-right">Last Published: 2017-02-23</li> <li class="divider pull-right">|</li> - <li id="projectVersion" class="pull-right">Version: 0.3.1</li> + <li id="publishDate" class="pull-right">Last Published: 2017-06-27</li> <li class="divider pull-right">|</li> + <li id="projectVersion" class="pull-right">Version: 0.4.0</li> </ul> </div> @@ -78,7 +75,7 @@ <ul class="nav nav-list"> <li class="nav-header">User Documentation</li> - + <li> <a href="../../index.html" title="Metron"> @@ -99,7 +96,7 @@ <i class="icon-chevron-right"></i> Analytics</a> </li> - + <li> <a href="../../metron-deployment/index.html" title="Deployment"> @@ -113,7 +110,21 @@ <i class="none"></i> Docker</a> </li> - + + <li> + + <a href="../../metron-interface/metron-config/index.html" title="Config"> + <i class="none"></i> + Config</a> + </li> + + <li> + + <a href="../../metron-interface/metron-rest/index.html" title="Rest"> + <i class="none"></i> + Rest</a> + </li> + <li> <a href="../../metron-platform/index.html" title="Platform"> @@ -127,13 +138,13 @@ <i class="none"></i> Api</a> </li> - + <li> <a href="../../metron-platform/metron-common/index.html" title="Common"> - <i class="none"></i> + <i class="icon-chevron-right"></i> Common</a> - </li> + </li> <li> @@ -174,9 +185,16 @@ <i class="none"></i> 
Pcap-backend</a> </li> + + <li> + + <a href="../../metron-platform/metron-writer/index.html" title="Writer"> + <i class="none"></i> + Writer</a> + </li> </ul> </li> - + <li> <a href="../../metron-sensors/index.html" title="Sensors"> @@ -252,7 +270,7 @@ <div class="section"> <h2><a name="Parser_Architecture"></a>Parser Architecture</h2> <p><img src="../../images/parser_arch.png" alt="Architecture" /></p> -<p>Data flows through the parser bolt via kafka and into the <tt>enrichments</tt> topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an <tt>error</tt> queue. Invalid messages as determined by global validation functions are sent to an <tt>invalid</tt> queue. </p></div> +<p>Data flows through the parser bolt via kafka and into the <tt>enrichments</tt> topology in kafka. Errors are collected with the context of the error (e.g. stacktrace) and original message causing the error and sent to an <tt>error</tt> queue. Invalid messages as determined by global validation functions are also treated as errors and sent to an <tt>error</tt> queue. </p></div> <div class="section"> <h2><a name="Message_Format"></a>Message Format</h2> <p>All Metron messages follow a specific format in order to ingest a message. If a message does not conform to this format it will be dropped and put onto an error queue for further examination. The message must be of a JSON format and must have a JSON tag message like so:</p> @@ -260,7 +278,6 @@ <div class="source"> <div class="source"> <pre>{"message" : message content} - </pre></div></div> <p>Where appropriate there is also a standardization around the 5-tuple JSON fields. This is done so the topology correlation engine further downstream can correlate messages from different topologies by these fields. We are currently working on expanding the message standardization beyond these fields, but this feature is not yet available. 
The standard field names are as follows:</p> @@ -295,7 +312,6 @@ "original_string": xxx, "additional-field 1": xxx, } - } </pre></div></div></div> <div class="section"> @@ -537,9 +553,6 @@ HH:mm:ss', MAP_GET(dc, dc2tz, 'UTC') )" -ewp,--error_writer_p <PARALLELISM_HINT> Error Writer Parallelism Hint -h,--help This screen - -iwnt,--invalid_writer_num_tasks <NUM_TASKS> Invalid Writer Num Tasks - -iwp,--invalid_writer_p <PARALLELISM_HINT> Invalid Message Writer - Parallelism Hint -k,--kafka <BROKER_URL> Kafka Broker URL -mt,--message_timeout <TIMEOUT_IN_SECS> Message Timeout in Seconds -mtp,--max_task_parallelism <MAX_TASK> Max task parallelism @@ -560,34 +573,35 @@ HH:mm:ss', MAP_GET(dc, dc2tz, 'UTC') )" <ul> -<li>retryDelayMaxMs</li> +<li><tt>spout.pollTimeoutMs</tt> - Specifies the time, in milliseconds, spent waiting in poll if data is not available. Default is 2s</li> -<li>retryDelayMultiplier</li> +<li><tt>spout.firstPollOffsetStrategy</tt> - Sets the offset used by the Kafka spout in the first poll to Kafka broker upon process start. One of -<li>retryInitialDelayMs</li> - -<li>stateUpdateIntervalMs</li> - -<li>bufferSizeBytes</li> - -<li>fetchMaxWait</li> +<ul> + +<li><tt>EARLIEST</tt></li> + +<li><tt>LATEST</tt></li> + +<li><tt>UNCOMMITTED_EARLIEST</tt> - Last uncommitted and if offsets aren’t found, defaults to earliest. NOTE: This is the default.</li> + +<li><tt>UNCOMMITTED_LATEST</tt> - Last uncommitted and if offsets aren’t found, defaults to latest.</li> + </ul></li> -<li>fetchSizeBytes</li> +<li><tt>spout.offsetCommitPeriodMs</tt> - Specifies the period, in milliseconds, the offset commit task is periodically called. Default is 15s.</li> -<li>maxOffsetBehind</li> +<li><tt>spout.maxUncommittedOffsets</tt> - Defines the max number of polled offsets (records) that can be pending commit, before another poll can take place. 
Once this limit is reached, no more offsets (records) can be polled until the next successful commit(s) sets the number of pending offsets below the threshold. The default is 10,000,000.</li> -<li>metricsTimeBucketSizeInSecs</li> +<li><tt>spout.maxRetries</tt> - Defines the max number of retries in case of tuple failure. The default is to retry forever, which means that no new records are committed until the previous polled records have been acked. This guarantees at-least-once delivery of all the previously polled records. By specifying a finite value for maxRetries, the user decides to sacrifice guarantee of delivery for the previous polled records in favor of processing more records.</li> -<li>socketTimeoutMs</li> +<li>Any of the configs in the Consumer API for <a class="externalLink" href="http://kafka.apache.org/0100/documentation.html#newconsumerconfigs">Kafka 0.10.x</a></li> </ul> -<p>These are described in some detail <a class="externalLink" href="https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_storm-user-guide/content/storm-kafka-api-ref.html">here</a>.</p> -<p>For instance, creating a JSON file which will set the <tt>bufferSizeBytes</tt> to 2MB and <tt>retryDelayMaxMs</tt> to 2000 would look like</p> +<p>For instance, creating a JSON file which will set the offsets to <tt>UNCOMMITTED_EARLIEST</tt> would look like</p> <div class="source"> <div class="source"> <pre>{ - "bufferSizeBytes" : 2000000, - "retryDelayMaxMs" : 2000 + "spout.firstPollOffsetStrategy" : "UNCOMMITTED_EARLIEST" } </pre></div></div> <p>This would be loaded by passing the file as argument to <tt>--extra_kafka_spout_config</tt></p></div> @@ -654,15 +668,6 @@ HH:mm:ss', MAP_GET(dc, dc2tz, 'UTC') )" <li><tt>--error_writer_p</tt> : The parallelism hint for the error writer bolt</li> </ul></li> - -<li>The Invalid Message Writer Bolt - -<ul> - -<li><tt>--invalid_writer_num_tasks</tt> : The number of tasks for the error writer bolt</li> - -<li><tt>--invalid_writer_p</tt> : The parallelism hint for the 
error writer bolt</li> - </ul></li> </ul> <p>Finally, if workers and executors are new to you, the following might be of use to you:</p> @@ -678,8 +683,9 @@ HH:mm:ss', MAP_GET(dc, dc2tz, 'UTC') )" <footer> <div class="container-fluid"> - <div class="row span12">Copyright © 2017. - All Rights Reserved. + <div class="row span12">Copyright © 2017 + <a href="https://www.apache.org">The Apache Software Foundation</a>. + All Rights Reserved. </div> http://git-wip-us.apache.org/repos/asf/metron/blob/cf4c2ecd/current-book/metron-platform/metron-pcap-backend/index.html ---------------------------------------------------------------------- diff --git a/current-book/metron-platform/metron-pcap-backend/index.html b/current-book/metron-platform/metron-pcap-backend/index.html index a36dc50..af673d5 100644 --- a/current-book/metron-platform/metron-pcap-backend/index.html +++ b/current-book/metron-platform/metron-pcap-backend/index.html @@ -1,13 +1,13 @@ <!DOCTYPE html> <!-- - | Generated by Apache Maven Doxia at 2017-02-23 + | Generated by Apache Maven Doxia at 2017-06-27 | Rendered using Apache Maven Fluido Skin 1.3.0 --> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta charset="UTF-8" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> - <meta name="Date-Revision-yyyymmdd" content="20170223" /> + <meta name="Date-Revision-yyyymmdd" content="20170627" /> <meta http-equiv="Content-Language" content="en" /> <title>Metron – Metron PCAP Backend</title> <link rel="stylesheet" href="../../css/apache-maven-fluido-1.3.0.min.css" /> @@ -30,14 +30,11 @@ <div class="container-fluid"> <div id="banner"> <div class="pull-left"> - <a href="http://metron.incubator.apache.org/" id="bannerLeft"> - <img src="../../images/metron-logo.png" alt="Apache Metron - Incubating" width="148px" height="48px"/> + <a href="http://metron.apache.org/" id="bannerLeft"> + <img src="../../images/metron-logo.png" alt="Apache Metron" width="148px" 
height="48px"/> </a> </div> - <div class="pull-right"> <a href="http://incubator.apache.org/" id="bannerRight"> - <img src="../../images/ApacheIncubating_Logo.png" alt="Apache Incubating" width="192px" height="48px"/> - </a> - </div> + <div class="pull-right"> </div> <div class="clear"><hr/></div> </div> @@ -51,8 +48,8 @@ </li> <li class="divider ">/</li> <li class=""> - <a href="http://metron.incubator.apache.org/" class="externalLink" title="Metron-Incubating"> - Metron-Incubating</a> + <a href="http://metron.apache.org/" class="externalLink" title="Metron"> + Metron</a> </li> <li class="divider ">/</li> <li class=""> @@ -64,8 +61,8 @@ - <li id="publishDate" class="pull-right">Last Published: 2017-02-23</li> <li class="divider pull-right">|</li> - <li id="projectVersion" class="pull-right">Version: 0.3.1</li> + <li id="publishDate" class="pull-right">Last Published: 2017-06-27</li> <li class="divider pull-right">|</li> + <li id="projectVersion" class="pull-right">Version: 0.4.0</li> </ul> </div> @@ -78,7 +75,7 @@ <ul class="nav nav-list"> <li class="nav-header">User Documentation</li> - + <li> <a href="../../index.html" title="Metron"> @@ -99,7 +96,7 @@ <i class="icon-chevron-right"></i> Analytics</a> </li> - + <li> <a href="../../metron-deployment/index.html" title="Deployment"> @@ -113,7 +110,21 @@ <i class="none"></i> Docker</a> </li> - + + <li> + + <a href="../../metron-interface/metron-config/index.html" title="Config"> + <i class="none"></i> + Config</a> + </li> + + <li> + + <a href="../../metron-interface/metron-rest/index.html" title="Rest"> + <i class="none"></i> + Rest</a> + </li> + <li> <a href="../../metron-platform/index.html" title="Platform"> @@ -127,13 +138,13 @@ <i class="none"></i> Api</a> </li> - + <li> <a href="../../metron-platform/metron-common/index.html" title="Common"> - <i class="none"></i> + <i class="icon-chevron-right"></i> Common</a> - </li> + </li> <li> @@ -174,9 +185,16 @@ <a href="#"><i class="none"></i>Pcap-backend</a> </li> + + 
<li> + + <a href="../../metron-platform/metron-writer/index.html" title="Writer"> + <i class="none"></i> + Writer</a> + </li> </ul> </li> - + <li> <a href="../../metron-sensors/index.html" title="Sensors"> @@ -207,7 +225,31 @@ <h1>Metron PCAP Backend</h1> <p><a name="Metron_PCAP_Backend"></a></p> -<p>The purpose of the Metron PCAP backend is to create a storm topology capable of ingesting rapidly raw packet capture data directly into HDFS from Kafka.</p> +<p>The purpose of the Metron PCAP backend is to create a storm topology capable of rapidly ingesting raw packet capture data directly into HDFS from Kafka.</p> + +<ul> + +<li><a href="#the-sensors-feeding-kafka">Sensors</a></li> + +<li><a href="#the-pcap-topology">PCAP Topology</a></li> + +<li><a href="#the-files-on-hdfs">HDFS Files</a></li> + +<li><a href="#Configuration">Configuration</a></li> + +<li><a href="#Starting_the_Topology">Starting the Topology</a></li> + +<li><a href="#Utilities">Utilities</a> + +<ul> + +<li><a href="#Inspector_Utility">Inspector Utility</a></li> + +<li><a href="#Query_Filter_Utility">Query Filter Utility</a></li> + </ul></li> + +<li><a href="#Performance_Tuning">Performance Tuning</a></li> +</ul> <div class="section"> <h2><a name="The_Sensors_Feeding_Kafka"></a>The Sensors Feeding Kafka</h2> <p>This component must be fed by fast packet capture components upstream via Kafka. 
The two supported components shipped with Metron are as follows:</p> @@ -257,15 +299,19 @@ <p>These files contain a set of packet data with headers on them in sequence files.</p></div> <div class="section"> <h2><a name="Configuration"></a>Configuration</h2> -<p>The configuration file for the Flux topology is located at <tt>$METRON_HOME/config/etc/env/pcap.properties</tt> and the possible options are as follows:</p> +<p>The configuration file for the Flux topology is located at <tt>$METRON_HOME/config/pcap.properties</tt> and the possible options are as follows:</p> <ul> <li><tt>spout.kafka.topic.pcap</tt> : The kafka topic to listen to</li> +<li><tt>storm.auto.credentials</tt> : The kerberos ticket renewal. If running on a kerberized cluster, this should be <tt>['org.apache.storm.security.auth.kerberos.AutoTGT']</tt></li> + +<li><tt>kafka.security.protocol</tt> : The security protocol to use for kafka. This should be <tt>PLAINTEXT</tt> for a non-kerberized cluster and probably <tt>SASL_PLAINTEXT</tt> for a kerberized cluster.</li> + <li><tt>kafka.zk</tt> : The comma separated zookeeper quorum (i.e. host:2181,host2:2181)</li> -<li><tt>kafka.pcap.start</tt> : One of <tt>START</tt>, <tt>END</tt>, <tt>WHERE_I_LEFT_OFF</tt> representing where to start listening on the queue.</li> +<li><tt>kafka.pcap.start</tt> : One of <tt>EARLIEST</tt>, <tt>LATEST</tt>, <tt>UNCOMMITTED_EARLIEST</tt>, <tt>UNCOMMITTED_LATEST</tt> representing where to start listening on the queue.</li> <li><tt>kafka.pcap.numPackets</tt> : The number of packets to keep in one file.</li> @@ -301,7 +347,7 @@ <li>fixed</li> -<li>query (Metron Stellar)</li> +<li>query (via Stellar)</li> </ul> <p>The tool is executed via </p> @@ -324,6 +370,7 @@ and end_time. Default is to use time in millis since the epoch. -dp,--ip_dst_port <arg> Destination port + -pf,--packet_filter <arg> Packet filter regex -et,--end_time <arg> Packet end time range. Default is current system time. 
-nr,--num_reducers <arg> The number of reducers to use. Default @@ -354,7 +401,217 @@ -h,--help Display help -q,--query <arg> Query string to use as a filter -st,--start_time <arg> (required) Packet start time range. -</pre></div></div></div></div></div> +</pre></div></div> +<p>The Query filter’s <tt>--query</tt> argument specifies the Stellar expression to execute on each packet. To interact with the packet, a few variables are exposed:</p> + +<ul> + +<li><tt>packet</tt> : The packet data (a <tt>byte[]</tt>)</li> + +<li><tt>ip_src_addr</tt> : The source address for the packet (a <tt>String</tt>)</li> + +<li><tt>ip_src_port</tt> : The source port for the packet (an <tt>Integer</tt>)</li> + +<li><tt>ip_dst_addr</tt> : The destination address for the packet (a <tt>String</tt>)</li> + +<li><tt>ip_dst_port</tt> : The destination port for the packet (an <tt>Integer</tt>)</li> +</ul></div> +<div class="section"> +<h4><a name="Binary_Regex"></a>Binary Regex</h4> +<p>Filtering can be done both by the packet header and by a binary regular expression which is run on the packet payload itself. This filter can be specified via:</p> + +<ul> + +<li>The <tt>-pf</tt> or <tt>--packet_filter</tt> options for the fixed query filter</li> + +<li>The <tt>BYTEARRAY_MATCHER(pattern, data)</tt> Stellar function. The first argument is the regex pattern and the second argument is the data. The packet data will be exposed via the <tt>packet</tt> variable in Stellar.</li> +</ul> +<p>The format of this regular expression is described <a class="externalLink" href="https://github.com/nishihatapalmer/byteseek/blob/master/sequencesyntax.md">here</a>.</p></div></div></div> +<div class="section"> +<h2><a name="Performance_Tuning"></a>Performance Tuning</h2> +<p>The PCAP topology is extremely lightweight and functions as a Spout-only topology.
In order to tune the topology, users currently must specify a combination of properties in pcap.properties as well as configuration in the pcap remote.yaml flux file itself. Tuning the number of partitions in your Kafka topic will have a dramatic impact on performance as well. We ran data into Kafka at 1.1 Gbps and our tests resulted in configuring 128 partitions for our Kafka topic along with the following settings in pcap.properties and remote.yaml (properties unrelated to performance have been removed):</p> +<div class="section"> +<h3><a name="pcap.properties_file"></a>pcap.properties file</h3> + +<div class="source"> +<div class="source"> +<pre>spout.kafka.topic.pcap=pcap +storm.topology.workers=16 +kafka.spout.parallelism=128 +kafka.pcap.numPackets=1000000000 +kafka.pcap.maxTimeMS=0 +hdfs.replication=1 +hdfs.sync.every=10000 +</pre></div></div> +<p>You’ll notice that the number of Kafka partitions equals the spout parallelism, and this is no coincidence. The ordering guarantees for a partition in Kafka mean that you may have no more than one consumer per partition within a consumer group. Any additional parallelism will leave you with dormant threads consuming resources but performing no additional work. For our cluster with 4 Storm Supervisors, we found 16 workers to provide optimal throughput as well. We were largely IO bound rather than CPU bound with the incoming PCAP data.</p></div> +<div class="section"> +<h3><a name="remote.yaml"></a>remote.yaml</h3> +<p>In the flux file, we introduced the following configuration:</p> + +<div class="source"> +<div class="source"> +<pre>name: "pcap" +config: + topology.workers: ${storm.topology.workers} + topology.worker.childopts: ${topology.worker.childopts} + topology.auto-credentials: ${storm.auto.credentials} + topology.ackers.executors: 0 +components: + + # Any kafka props for the producer go here.
+ - id: "kafkaProps" + className: "java.util.HashMap" + configMethods: + - name: "put" + args: + - "value.deserializer" + - "org.apache.kafka.common.serialization.ByteArrayDeserializer" + - name: "put" + args: + - "key.deserializer" + - "org.apache.kafka.common.serialization.ByteArrayDeserializer" + - name: "put" + args: + - "group.id" + - "pcap" + - name: "put" + args: + - "security.protocol" + - "${kafka.security.protocol}" + - name: "put" + args: + - "poll.timeout.ms" + - 100 + - name: "put" + args: + - "offset.commit.period.ms" + - 30000 + - name: "put" + args: + - "session.timeout.ms" + - 30000 + - name: "put" + args: + - "max.uncommitted.offsets" + - 200000000 + - name: "put" + args: + - "max.poll.interval.ms" + - 10 + - name: "put" + args: + - "max.poll.records" + - 200000 + - name: "put" + args: + - "receive.buffer.bytes" + - 431072 + - name: "put" + args: + - "max.partition.fetch.bytes" + - 8097152 + + - id: "hdfsProps" + className: "java.util.HashMap" + configMethods: + - name: "put" + args: + - "io.file.buffer.size" + - 1000000 + - name: "put" + args: + - "dfs.blocksize" + - 1073741824 + + - id: "kafkaConfig" + className: "org.apache.metron.storm.kafka.flux.SimpleStormKafkaBuilder" + constructorArgs: + - ref: "kafkaProps" + # topic name + - "${spout.kafka.topic.pcap}" + - "${kafka.zk}" + configMethods: + - name: "setFirstPollOffsetStrategy" + args: + # One of EARLIEST, LATEST, UNCOMMITTED_EARLIEST, UNCOMMITTED_LATEST + - ${kafka.pcap.start} + + - id: "writerConfig" + className: "org.apache.metron.spout.pcap.HDFSWriterConfig" + configMethods: + - name: "withOutputPath" + args: + - "${kafka.pcap.out}" + - name: "withNumPackets" + args: + - ${kafka.pcap.numPackets} + - name: "withMaxTimeMS" + args: + - ${kafka.pcap.maxTimeMS} + - name: "withZookeeperQuorum" + args: + - "${kafka.zk}" + - name: "withSyncEvery" + args: + - ${hdfs.sync.every} + - name: "withReplicationFactor" + args: + - ${hdfs.replication} + - name: "withHDFSConfig" + args: + - ref: 
"hdfsProps" + - name: "withDeserializer" + args: + - "${kafka.pcap.ts_scheme}" + - "${kafka.pcap.ts_granularity}" +spouts: + - id: "kafkaSpout" + className: "org.apache.metron.spout.pcap.KafkaToHDFSSpout" + parallelism: ${kafka.spout.parallelism} + constructorArgs: + - ref: "kafkaConfig" + - ref: "writerConfig" + +</pre></div></div> +<div class="section"> +<h4><a name="Flux_Changes_Introduced"></a>Flux Changes Introduced</h4> +<div class="section"> +<h5><a name="Topology_Configuration"></a>Topology Configuration</h5> +<p>The only change here is <tt>topology.ackers.executors: 0</tt>, which disables Storm tuple acking for maximum throughput.</p></div> +<div class="section"> +<h5><a name="Kafka_configuration"></a>Kafka configuration</h5> + +<div class="source"> +<div class="source"> +<pre>poll.timeout.ms +offset.commit.period.ms +session.timeout.ms +max.uncommitted.offsets +max.poll.interval.ms +max.poll.records +receive.buffer.bytes +max.partition.fetch.bytes +</pre></div></div></div> +<div class="section"> +<h5><a name="Writer_Configuration"></a>Writer Configuration</h5> +<p>This is a combination of settings for the HDFSWriter (see pcap.properties values above) as well as HDFS.</p> +<p><b>HDFS config</b></p> +<p>Component config HashMap with the following properties:</p> + +<div class="source"> +<div class="source"> +<pre>io.file.buffer.size +dfs.blocksize +</pre></div></div> +<p><b>Writer config</b></p> +<p>References the HDFS props component specified above.</p> + +<div class="source"> +<div class="source"> +<pre> - name: "withHDFSConfig" + args: + - ref: "hdfsProps" +</pre></div></div></div></div></div></div> </div> </div> </div> @@ -363,8 +620,9 @@ <footer> <div class="container-fluid"> - <div class="row span12">Copyright © 2017. - All Rights Reserved. + <div class="row span12">Copyright © 2017 + <a href="https://www.apache.org">The Apache Software Foundation</a>. + All Rights Reserved. </div>
