That sheds light. Thanks, Nick.
--Matt

From: Nick Allen <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, May 25, 2017 at 3:06 PM
To: "[email protected]" <[email protected]>
Subject: Re: Metron HBase conditional enrichment

Each topology has its own uber-jar that is built from all of its dependencies. Its classpath is basically whatever is in the uber-jar. That's why running with -pl against the project from which the uber-jar is built should identify the Stellar functions available to each topology.

When running the REPL from a deployed instance of Metron, it pulls in all of the jars deployed to /usr/metron/(version)/lib.

On May 25, 2017 3:29 PM, "Matt Foley" <[email protected]> wrote:

Nick is correct that in any given environment, only Stellar functions defined in jars on the current classpath will be available. When running with the maven exec:java plugin, as below, this means only jars declared (or transitively required) as dependencies of the given project.

In the installed environment, however, it is strictly a matter of the configured classpath. I thought (I could be wrong) that all the Metron jars are installed together (in /usr/metron/<version>/lib/*, on CentOS 7), and that all those jars will be added to the classpath for all topologies. Nick, you’re much more familiar with the install stuff than I am; please clarify, if you can, how the classpath is configured for the running topologies in the installed env.

At any rate, it is the classpath currently being run under that determines the availability of Stellar functions.

Thanks,
--Matt

From: Nick Allen <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, May 25, 2017 at 12:05 PM
To: "[email protected]" <[email protected]>
Subject: Re: Metron HBase conditional enrichment

Correct me if I am wrong, Matt, but I believe that changing the project that you pass to the -pl switch will allow you to see exactly what Stellar functions would be available in each topology. You just have to refer to whichever project drives the topology.
This might help answer Ali's previous question as to what functions are available where. And of course, if some function is not available where it is needed, then it is simply a matter of changing the dependencies to make it available.

For example, these functions are available from the Profiler.

$ mvn exec:java \
    -Dexec.mainClass="org.apache.metron.common.stellar.shell.StellarShell" \
    -pl metron-analytics/metron-profiler
...
Stellar, Go!
Please note that functions are loading lazily in the background and will be unavailable until loaded fully.
[Stellar]>>> Functions loaded, you may refer to functions now...
%functions
ABS, APPEND_IF_MISSING, BIN, BLOOM_ADD, BLOOM_EXISTS, BLOOM_INIT, BLOOM_MERGE, CHOMP, CHOP, COUNT_MATCHES, DAY_OF_MONTH, DAY_OF_WEEK, DAY_OF_YEAR, DOMAIN_REMOVE_SUBDOMAINS, DOMAIN_REMOVE_TLD, DOMAIN_TO_TLD, ENDS_WITH, FILL_LEFT, FILL_RIGHT, FILTER, FORMAT, GET, GET_FIRST, GET_LAST, HLLP_ADD, HLLP_CARDINALITY, HLLP_INIT, HLLP_MERGE, IN_SUBNET, IS_DATE, IS_DOMAIN, IS_EMAIL, IS_EMPTY, IS_INTEGER, IS_IP, IS_URL, JOIN, LENGTH, LIST_ADD, MAAS_GET_ENDPOINT, MAAS_MODEL_APPLY, MAP, MAP_EXISTS, MAP_GET, MONTH, OUTLIER_MAD_ADD, OUTLIER_MAD_SCORE, OUTLIER_MAD_STATE_MERGE, PREPEND_IF_MISSING, PROFILE_FIXED, PROFILE_GET, PROFILE_WINDOW, PROTOCOL_TO_NAME, REDUCE, REGEXP_MATCH, SPLIT, STARTS_WITH, STATS_ADD, STATS_BIN, STATS_COUNT, STATS_GEOMETRIC_MEAN, STATS_INIT, STATS_KURTOSIS, STATS_MAX, STATS_MEAN, STATS_MERGE, STATS_MIN, STATS_PERCENTILE, STATS_POPULATION_VARIANCE, STATS_QUADRATIC_MEAN, STATS_SD, STATS_SKEWNESS, STATS_SUM, STATS_SUM_LOGS, STATS_SUM_SQUARES, STATS_VARIANCE, STRING_ENTROPY, SYSTEM_ENV_GET, SYSTEM_PROPERTY_GET, TO_DOUBLE, TO_EPOCH_TIMESTAMP, TO_FLOAT, TO_INTEGER, TO_LONG, TO_LOWER, TO_STRING, TO_UPPER, TRIM, URL_TO_HOST, URL_TO_PATH, URL_TO_PORT, URL_TO_PROTOCOL, WEEK_OF_MONTH, WEEK_OF_YEAR, YEAR

And these functions are available in the Enrichment topology.

$ mvn exec:java \
    -Dexec.mainClass="org.apache.metron.common.stellar.shell.StellarShell" \
    -pl metron-platform/metron-enrichment/
...
Stellar, Go!
Please note that functions are loading lazily in the background and will be unavailable until loaded fully.
[Stellar]>>> Functions loaded, you may refer to functions now...
[Stellar]>>> %functions
ABS, APPEND_IF_MISSING, BIN, BLOOM_ADD, BLOOM_EXISTS, BLOOM_INIT, BLOOM_MERGE, CHOMP, CHOP, COUNT_MATCHES, DAY_OF_MONTH, DAY_OF_WEEK, DAY_OF_YEAR, DOMAIN_REMOVE_SUBDOMAINS, DOMAIN_REMOVE_TLD, DOMAIN_TO_TLD, ENDS_WITH, ENRICHMENT_EXISTS, ENRICHMENT_GET, FILL_LEFT, FILL_RIGHT, FILTER, FORMAT, GEO_GET, GET, GET_FIRST, GET_LAST, HLLP_ADD, HLLP_CARDINALITY, HLLP_INIT, HLLP_MERGE, IN_SUBNET, IS_DATE, IS_DOMAIN, IS_EMAIL, IS_EMPTY, IS_INTEGER, IS_IP, IS_URL, JOIN, LENGTH, LIST_ADD, MAAS_GET_ENDPOINT, MAAS_MODEL_APPLY, MAP, MAP_EXISTS, MAP_GET, MONTH, OUTLIER_MAD_ADD, OUTLIER_MAD_SCORE, OUTLIER_MAD_STATE_MERGE, PREPEND_IF_MISSING, PROFILE_FIXED, PROFILE_GET, PROFILE_WINDOW, PROTOCOL_TO_NAME, REDUCE, REGEXP_MATCH, SPLIT, STARTS_WITH, STATS_ADD, STATS_BIN, STATS_COUNT, STATS_GEOMETRIC_MEAN, STATS_INIT, STATS_KURTOSIS, STATS_MAX, STATS_MEAN, STATS_MERGE, STATS_MIN, STATS_PERCENTILE, STATS_POPULATION_VARIANCE, STATS_QUADRATIC_MEAN, STATS_SD, STATS_SKEWNESS, STATS_SUM, STATS_SUM_LOGS, STATS_SUM_SQUARES, STATS_VARIANCE, STRING_ENTROPY, SYSTEM_ENV_GET, SYSTEM_PROPERTY_GET, TO_DOUBLE, TO_EPOCH_TIMESTAMP, TO_FLOAT, TO_INTEGER, TO_LONG, TO_LOWER, TO_STRING, TO_UPPER, TRIM, URL_TO_HOST, URL_TO_PATH, URL_TO_PORT, URL_TO_PROTOCOL, WEEK_OF_MONTH, WEEK_OF_YEAR, YEAR

For example, I can see from this that `GEO_GET` is available during Enrichment, but is not available in the Profiler right now. This makes sense because `GEO_GET` is defined in metron-enrichment.

On Thu, May 25, 2017 at 1:58 PM, Nick Allen <[email protected]> wrote:

Thanks, Matt. That command line doesn't work for me, even after updating the version number. Not sure why.
This is what I tend to run after a build and install of Metron. Maybe something is different in my environment.

mvn clean install -DskipTests
mvn exec:java -Dexec.mainClass="org.apache.metron.common.stellar.shell.StellarShell" -pl metron-platform/metron-common

For example...

$ mvn exec:java -Dexec.mainClass="org.apache.metron.common.stellar.shell.StellarShell" -pl metron-platform/metron-common
...
Stellar, Go!
Please note that functions are loading lazily in the background and will be unavailable until loaded fully.
[Stellar]>>> Functions loaded, you may refer to functions now...
[Stellar]>>> %functions
APPEND_IF_MISSING, BLOOM_ADD, BLOOM_EXISTS, BLOOM_INIT, BLOOM_MERGE, CHOMP, CHOP, COUNT_MATCHES, DAY_OF_MONTH, DAY_OF_WEEK, DAY_OF_YEAR, DOMAIN_REMOVE_SUBDOMAINS, DOMAIN_REMOVE_TLD, DOMAIN_TO_TLD, ENDS_WITH, FILL_LEFT, FILL_RIGHT, FILTER, FORMAT, GET, GET_FIRST, GET_LAST, IN_SUBNET, IS_DATE, IS_DOMAIN, IS_EMAIL, IS_EMPTY, IS_INTEGER, IS_IP, IS_URL, JOIN, LENGTH, LIST_ADD, MAAS_GET_ENDPOINT, MAAS_MODEL_APPLY, MAP, MAP_EXISTS, MAP_GET, MONTH, PREPEND_IF_MISSING, PROTOCOL_TO_NAME, REDUCE, REGEXP_MATCH, SPLIT, STARTS_WITH, STRING_ENTROPY, SYSTEM_ENV_GET, SYSTEM_PROPERTY_GET, TO_DOUBLE, TO_EPOCH_TIMESTAMP, TO_FLOAT, TO_INTEGER, TO_LONG, TO_LOWER, TO_STRING, TO_UPPER, TRIM, URL_TO_HOST, URL_TO_PATH, URL_TO_PORT, URL_TO_PROTOCOL, WEEK_OF_MONTH, WEEK_OF_YEAR, YEAR
[Stellar]>>>

On Thu, May 25, 2017 at 1:31 PM, Matt Foley <[email protected]> wrote:

Hi Ali,

When writing Stellar statements, it is convenient to test them out in the REPL, which can be invoked via some variant of the commands in https://github.com/apache/metron/blob/master/metron-platform/metron-common/src/main/scripts/stellar (depending on the particular environment you’re working in).
For instance, from the root of a git clone (that has already been successfully compiled), you can run:

java -cp metron-platform/metron-enrichment/target/metron-enrichment-0.3.1-uber.jar:metron-platform/metron-data-management/target/metron-data-management-0.3.1.jar:metron-platform/metron-common/target/metron-common-0.3.1.jar:metron-platform/metron-management/target/metron-management-0.3.1.jar:metron-platform/metron-parsers/target/metron-parsers-0.3.1-uber.jar org.apache.metron.common.stellar.shell.StellarShell

As you can probably infer from the above, any jar that has @Stellar-annotated classes has to be included in the classpath, or those Stellar operators won’t be available. The main jar is metron-common, which will give you the REPL and the “base” operator set, but many interesting ops are in those other jars. (If I missed one, just add it to the classpath.)

Once in the REPL you can type `%functions` to get a list of all available Stellar operators.

Basically, any class annotated with @Stellar will automatically be loaded, both in the REPL and in the installed runtime environment.

Cheers,
--Matt

From: Otto Fowler <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, May 25, 2017 at 5:23 AM
To: Nick Allen <[email protected]>, "[email protected]" <[email protected]>
Subject: Re: Metron HBase conditional enrichment

I think most of those restricted functions are in the metron-management section.

On May 25, 2017 at 07:27:24, Nick Allen ([email protected]) wrote:

> everywhere I can use Stellar DSL, all of the functions have been implemented
> and ready to use?

Generally, yes, you are right. I vaguely remember a couple of instances of functions that are useful in the REPL only, but I cannot remember what those are right now. Hopefully we have those doc'd appropriately.

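To see which functions one project offers that another lacks, the `%functions` listings quoted in this thread can be compared mechanically. The following is a minimal Python sketch, not part of Metron: `parse_functions` and `only_in` are hypothetical helper names, and the two strings are shortened excerpts of the Profiler and Enrichment listings, not the full output.

```python
def parse_functions(listing):
    """Split comma-separated `%functions` output into a set of function names."""
    return {name.strip() for name in listing.split(",") if name.strip()}


def only_in(first, second):
    """Return the sorted names that appear in `first` but not in `second`."""
    return sorted(parse_functions(first) - parse_functions(second))


# Shortened excerpts of the listings in this thread (assumption: abbreviated for illustration).
profiler = "ENDS_WITH, FILL_LEFT, FORMAT, GET"
enrichment = "ENDS_WITH, ENRICHMENT_EXISTS, ENRICHMENT_GET, FILL_LEFT, FORMAT, GEO_GET, GET"

print(only_in(enrichment, profiler))
# prints ['ENRICHMENT_EXISTS', 'ENRICHMENT_GET', 'GEO_GET']
```

Pasting the complete listings into the two strings would give the full difference for whichever pair of projects was passed to -pl.
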
On Wed, May 24, 2017 at 10:38 PM, Ali Nazemian <[email protected]> wrote:

Hi Nick,

I was not sure about the implementation, so does it generally mean that everywhere I can use the Stellar DSL, all of the functions have been implemented and are ready to use?

Cheers,
Ali

On Thu, May 25, 2017 at 2:52 AM, Nick Allen <[email protected]> wrote:

> can I do the concatenation on the fly at the enrichment level, so I don't
> need to store this temp field in Elasticsearch/HDFS.

Sure, absolutely.

> Moreover, I need to have a conditional enrichment to say if you couldn't find
> any match for "tenant_name+device_type+device_name" lookup for
> "tenant_name+device_type+default_device".

Yes, you can. You've got if/else, JOIN, IS_EMPTY, and others that should make implementing this logic pretty easy.

On Tue, May 23, 2017 at 10:34 PM, Ali Nazemian <[email protected]> wrote:

Hi,

I was wondering how I can manage Stellar syntax to be aligned with the following structure for the HBase enrichment:

HBase_row_key: tenant_name+device_type+device_name

At a high level, I need to create a separate field via a post-parse Stellar function that is a concatenation of tenant_name, device_type and device_name. Let's call this field "key". Basically, I need to do the enrichment on the "key", which would correspond to the HBase row key.

My first question is: can I do the concatenation on the fly at the enrichment level, so I don't need to store this temp field in Elasticsearch/HDFS?

Moreover, I need to have a conditional enrichment to say: if you couldn't find any match for "tenant_name+device_type+device_name", look up "tenant_name+device_type+default_device". The second question is how I can manage a conditional enrichment like this one. I would be really grateful if you could provide an example.

Regards,
Ali

--
A.Nazemian
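
The fallback lookup asked about in this thread can be modelled outside of Metron to make the intended logic concrete. This is a minimal Python sketch under stated assumptions, not Stellar and not Metron's API: a plain dict stands in for the HBase enrichment table, `build_key` and `lookup_with_fallback` are hypothetical names, and the ":" delimiter is an assumption, since the thread does not say whether "+" is a literal separator or just denotes concatenation. In Stellar, the same shape would be expressed with the if/else, JOIN and IS_EMPTY constructs mentioned in the replies.

```python
def build_key(parts, delimiter=":"):
    """Join the key parts into one row key (the delimiter is an assumption)."""
    return delimiter.join(parts)


def lookup_with_fallback(table, tenant_name, device_type, device_name,
                         default_device="default_device"):
    """Look up the device-specific key first; fall back to the default-device key.

    `table` is a dict standing in for the HBase enrichment table (hypothetical).
    An empty or missing value counts as "no match", mirroring an IS_EMPTY check.
    """
    result = table.get(build_key([tenant_name, device_type, device_name]))
    if not result:
        result = table.get(build_key([tenant_name, device_type, default_device]))
    return result


# Hypothetical sample data for illustration only.
table = {
    "acme:firewall:fw01": {"owner": "secops"},
    "acme:firewall:default_device": {"owner": "netops"},
}

print(lookup_with_fallback(table, "acme", "firewall", "fw01"))
print(lookup_with_fallback(table, "acme", "firewall", "fw99"))
```

The first call matches the device-specific key; the second has no entry for fw99 and returns the value stored under the default-device key. The key is built on the fly in both cases, so no temporary field needs to be stored.
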
