That sheds light.  Thanks, Nick.
--Matt

From: Nick Allen <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, May 25, 2017 at 3:06 PM
To: "[email protected]" <[email protected]>
Subject: Re: Metron HBase conditional enrichment

Each topology has its own uber-jar that is built from all of its dependencies. 
 Its classpath is basically whatever is in the uber-jar.

That's why running with -pl against the project from which the uber jar is 
built should identify the Stellar functions available to each topology.

When running the REPL from a deployed instance of Metron it pulls in all of the 
jars deployed to /usr/metron/(version)/lib.

On May 25, 2017 3:29 PM, "Matt Foley" <[email protected]> wrote:
Nick is correct that in any given environment, only Stellar functions defined 
in jars on the current classpath will be available.

When running with the maven exec:java plugin, as below, this means only jars 
declared (or transitively required) as dependencies to the given project.

In the installed environment, however, it is strictly a matter of the 
configured classpath.  I thought (I could be wrong) that all the Metron jars 
are installed together (in /usr/metron/<version>/lib/* on CentOS 7), and that 
all those jars are added to the classpath for all topologies.

Nick, you’re much more familiar with the install stuff than I am.  Please 
clarify, if you can, how the classpath is configured for the running topologies 
in the installed env.

At any rate, it is the classpath currently being run under that determines the 
availability of Stellar functions.
Thanks,
--Matt

From: Nick Allen <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, May 25, 2017 at 12:05 PM
To: "[email protected]" <[email protected]>
Subject: Re: Metron HBase conditional enrichment

Correct me if I am wrong, Matt, but I believe that changing the project that 
you pass to the -pl switch will allow you to see exactly what Stellar functions 
would be available in each topology.  You just have to refer to whichever 
project drives the topology.

This might help answer Ali's previous question as to what functions are 
available where.  And of course, if some function is not available where it is 
needed, then it is simply a matter of changing the dependencies to make it 
available.

For example, these functions are available from the Profiler.

$ mvn exec:java \
-Dexec.mainClass="org.apache.metron.common.stellar.shell.StellarShell" \
-pl metron-analytics/metron-profiler
...
Stellar, Go!
Please note that functions are loading lazily in the background and will be 
unavailable until loaded fully.
[Stellar]>>> Functions loaded, you may refer to functions now...
%functions
ABS, APPEND_IF_MISSING, BIN, BLOOM_ADD, BLOOM_EXISTS, BLOOM_INIT, BLOOM_MERGE, 
CHOMP, CHOP, COUNT_MATCHES, DAY_OF_MONTH, DAY_OF_WEEK, DAY_OF_YEAR, 
DOMAIN_REMOVE_SUBDOMAINS, DOMAIN_REMOVE_TLD, DOMAIN_TO_TLD, ENDS_WITH, 
FILL_LEFT, FILL_RIGHT, FILTER, FORMAT, GET, GET_FIRST, GET_LAST, HLLP_ADD, 
HLLP_CARDINALITY, HLLP_INIT, HLLP_MERGE, IN_SUBNET, IS_DATE, IS_DOMAIN, 
IS_EMAIL, IS_EMPTY, IS_INTEGER, IS_IP, IS_URL, JOIN, LENGTH, LIST_ADD, 
MAAS_GET_ENDPOINT, MAAS_MODEL_APPLY, MAP, MAP_EXISTS, MAP_GET, MONTH, 
OUTLIER_MAD_ADD, OUTLIER_MAD_SCORE, OUTLIER_MAD_STATE_MERGE, 
PREPEND_IF_MISSING, PROFILE_FIXED, PROFILE_GET, PROFILE_WINDOW, 
PROTOCOL_TO_NAME, REDUCE, REGEXP_MATCH, SPLIT, STARTS_WITH, STATS_ADD, 
STATS_BIN, STATS_COUNT, STATS_GEOMETRIC_MEAN, STATS_INIT, STATS_KURTOSIS, 
STATS_MAX, STATS_MEAN, STATS_MERGE, STATS_MIN, STATS_PERCENTILE, 
STATS_POPULATION_VARIANCE, STATS_QUADRATIC_MEAN, STATS_SD, STATS_SKEWNESS, 
STATS_SUM, STATS_SUM_LOGS, STATS_SUM_SQUARES, STATS_VARIANCE, STRING_ENTROPY, 
SYSTEM_ENV_GET, SYSTEM_PROPERTY_GET, TO_DOUBLE, TO_EPOCH_TIMESTAMP, TO_FLOAT, 
TO_INTEGER, TO_LONG, TO_LOWER, TO_STRING, TO_UPPER, TRIM, URL_TO_HOST, 
URL_TO_PATH, URL_TO_PORT, URL_TO_PROTOCOL, WEEK_OF_MONTH, WEEK_OF_YEAR, YEAR


And these functions are available in the Enrichment topology.


$ mvn exec:java \
-Dexec.mainClass="org.apache.metron.common.stellar.shell.StellarShell" \
-pl metron-platform/metron-enrichment/
...
Stellar, Go!
Please note that functions are loading lazily in the background and will be 
unavailable until loaded fully.
[Stellar]>>> Functions loaded, you may refer to functions now...
[Stellar]>>> %functions
ABS, APPEND_IF_MISSING, BIN, BLOOM_ADD, BLOOM_EXISTS, BLOOM_INIT, BLOOM_MERGE, 
CHOMP, CHOP, COUNT_MATCHES, DAY_OF_MONTH, DAY_OF_WEEK, DAY_OF_YEAR, 
DOMAIN_REMOVE_SUBDOMAINS, DOMAIN_REMOVE_TLD, DOMAIN_TO_TLD, ENDS_WITH, 
ENRICHMENT_EXISTS, ENRICHMENT_GET, FILL_LEFT, FILL_RIGHT, FILTER, FORMAT, 
GEO_GET, GET, GET_FIRST, GET_LAST, HLLP_ADD, HLLP_CARDINALITY, HLLP_INIT, 
HLLP_MERGE, IN_SUBNET, IS_DATE, IS_DOMAIN, IS_EMAIL, IS_EMPTY, IS_INTEGER, 
IS_IP, IS_URL, JOIN, LENGTH, LIST_ADD, MAAS_GET_ENDPOINT, MAAS_MODEL_APPLY, 
MAP, MAP_EXISTS, MAP_GET, MONTH, OUTLIER_MAD_ADD, OUTLIER_MAD_SCORE, 
OUTLIER_MAD_STATE_MERGE, PREPEND_IF_MISSING, PROFILE_FIXED, PROFILE_GET, 
PROFILE_WINDOW, PROTOCOL_TO_NAME, REDUCE, REGEXP_MATCH, SPLIT, STARTS_WITH, 
STATS_ADD, STATS_BIN, STATS_COUNT, STATS_GEOMETRIC_MEAN, STATS_INIT, 
STATS_KURTOSIS, STATS_MAX, STATS_MEAN, STATS_MERGE, STATS_MIN, 
STATS_PERCENTILE, STATS_POPULATION_VARIANCE, STATS_QUADRATIC_MEAN, STATS_SD, 
STATS_SKEWNESS, STATS_SUM, STATS_SUM_LOGS, STATS_SUM_SQUARES, STATS_VARIANCE, 
STRING_ENTROPY, SYSTEM_ENV_GET, SYSTEM_PROPERTY_GET, TO_DOUBLE, 
TO_EPOCH_TIMESTAMP, TO_FLOAT, TO_INTEGER, TO_LONG, TO_LOWER, TO_STRING, 
TO_UPPER, TRIM, URL_TO_HOST, URL_TO_PATH, URL_TO_PORT, URL_TO_PROTOCOL, 
WEEK_OF_MONTH, WEEK_OF_YEAR, YEAR

For example, I can see from this that `GEO_GET` is available during Enrichment, 
but is not available in the Profiler right now.  This makes sense because 
`GEO_GET` is defined in metron-enrichment.
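
The same comparison can be done mechanically. Below is a minimal Python sketch, not part of Metron, that takes two `%functions` listings as text and reports the names present in one but not the other. The two strings are shortened excerpts of the Profiler and Enrichment listings in this thread, and the helper names `parse_functions` and `only_in` are made up for illustration.

```python
def parse_functions(listing):
    """Split a comma-separated %functions listing into a set of names."""
    return {name.strip() for name in listing.split(",") if name.strip()}


def only_in(first, second):
    """Return the sorted names that appear in `first` but not in `second`."""
    return sorted(parse_functions(first) - parse_functions(second))


# Shortened excerpts of the listings in this thread (line breaks are harmless).
profiler = """ENDS_WITH, FILL_LEFT, FILL_RIGHT, FILTER, FORMAT, GET, GET_FIRST,
GET_LAST"""
enrichment = """ENDS_WITH, ENRICHMENT_EXISTS, ENRICHMENT_GET, FILL_LEFT,
FILL_RIGHT, FILTER, FORMAT, GEO_GET, GET, GET_FIRST, GET_LAST"""

# Names the Enrichment excerpt has that the Profiler excerpt lacks.
missing_from_profiler = only_in(enrichment, profiler)
print(missing_from_profiler)
```

For these excerpts the result is ENRICHMENT_EXISTS, ENRICHMENT_GET and GEO_GET. Pasting complete listings in place of the excerpts works the same way.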

On Thu, May 25, 2017 at 1:58 PM, Nick Allen <[email protected]> wrote:
Thanks, Matt.  That command line doesn't work for me, even after updating the 
version number.  Not sure why.  This is what I tend to run after a build and 
install of Metron.  Maybe something is different in my environment.

mvn clean install -DskipTests

mvn exec:java \
-Dexec.mainClass="org.apache.metron.common.stellar.shell.StellarShell" \
-pl metron-platform/metron-common

For example...

$ mvn exec:java \
-Dexec.mainClass="org.apache.metron.common.stellar.shell.StellarShell" \
-pl metron-platform/metron-common
...

Stellar, Go!
Please note that functions are loading lazily in the background and will be 
unavailable until loaded fully.
[Stellar]>>> Functions loaded, you may refer to functions now...

[Stellar]>>> %functions
APPEND_IF_MISSING, BLOOM_ADD, BLOOM_EXISTS, BLOOM_INIT, BLOOM_MERGE, CHOMP, 
CHOP, COUNT_MATCHES, DAY_OF_MONTH, DAY_OF_WEEK, DAY_OF_YEAR, 
DOMAIN_REMOVE_SUBDOMAINS, DOMAIN_REMOVE_TLD, DOMAIN_TO_TLD, ENDS_WITH, 
FILL_LEFT, FILL_RIGHT, FILTER, FORMAT, GET, GET_FIRST, GET_LAST, IN_SUBNET, 
IS_DATE, IS_DOMAIN, IS_EMAIL, IS_EMPTY, IS_INTEGER, IS_IP, IS_URL, JOIN, 
LENGTH, LIST_ADD, MAAS_GET_ENDPOINT, MAAS_MODEL_APPLY, MAP, MAP_EXISTS, 
MAP_GET, MONTH, PREPEND_IF_MISSING, PROTOCOL_TO_NAME, REDUCE, REGEXP_MATCH, 
SPLIT, STARTS_WITH, STRING_ENTROPY, SYSTEM_ENV_GET, SYSTEM_PROPERTY_GET, 
TO_DOUBLE, TO_EPOCH_TIMESTAMP, TO_FLOAT, TO_INTEGER, TO_LONG, TO_LOWER, 
TO_STRING, TO_UPPER, TRIM, URL_TO_HOST, URL_TO_PATH, URL_TO_PORT, 
URL_TO_PROTOCOL, WEEK_OF_MONTH, WEEK_OF_YEAR, YEAR
[Stellar]>>>

On Thu, May 25, 2017 at 1:31 PM, Matt Foley <[email protected]> wrote:
Hi Ali,
When writing Stellar statements, it is convenient to test them out in the REPL, 
which can be invoked via some variant of the commands in 
https://github.com/apache/metron/blob/master/metron-platform/metron-common/src/main/scripts/stellar
depending on the particular environment you’re working in.  For instance, 
from the root of a git clone (that has already been successfully compiled), you 
can run:

java -cp \
metron-platform/metron-enrichment/target/metron-enrichment-0.3.1-uber.jar:metron-platform/metron-data-management/target/metron-data-management-0.3.1.jar:metron-platform/metron-common/target/metron-common-0.3.1.jar:metron-platform/metron-management/target/metron-management-0.3.1.jar:metron-platform/metron-parsers/target/metron-parsers-0.3.1-uber.jar \
 org.apache.metron.common.stellar.shell.StellarShell

As you can probably infer from the above, any jar that has @Stellar-annotated 
classes has to be included in the cp, or those Stellar operators won’t be 
available.  The main jar is metron-common, which will give you the REPL and the 
“base” operator set, but many interesting ops are in those other jars.  (If I 
missed one, just add it to the cp.)
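
Typing out a long classpath by hand is error-prone, so here is a minimal Python sketch of assembling one from every jar in a directory. It is only an illustration: `build_classpath` and `shell_command` are hypothetical helper names, and the jar names in the demo are placeholders created in a temporary folder. The only things taken from this thread are the `java -cp` form and the StellarShell main class.

```python
import os
import tempfile
from pathlib import Path

MAIN_CLASS = "org.apache.metron.common.stellar.shell.StellarShell"


def build_classpath(lib_dir):
    """Join every *.jar directly under lib_dir with the platform separator."""
    jars = sorted(str(path) for path in Path(lib_dir).glob("*.jar"))
    return os.pathsep.join(jars)


def shell_command(lib_dir):
    """Return the argument list for launching the REPL with that classpath."""
    return ["java", "-cp", build_classpath(lib_dir), MAIN_CLASS]


# Demo against a throwaway directory holding placeholder (empty) files.
with tempfile.TemporaryDirectory() as lib_dir:
    for name in ("a-uber.jar", "b.jar", "notes.txt"):
        Path(lib_dir, name).touch()
    demo_command = shell_command(lib_dir)
    # Recover just the file names that ended up on the classpath.
    demo_jars = [os.path.basename(p) for p in demo_command[2].split(os.pathsep)]

print(demo_jars)
```

Only the two `.jar` files end up on the classpath; `notes.txt` is ignored. With a directory of real jars, the returned list could be handed to `subprocess.run` to start the shell.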

Once in the REPL you can type `%functions` to get a list of all available 
Stellar operators.  Basically, any class annotated with @Stellar will 
automatically be loaded, both in the REPL and in the installed runtime 
environment.

Cheers,
--Matt

From: Otto Fowler <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, May 25, 2017 at 5:23 AM
To: Nick Allen <[email protected]>, "[email protected]" <[email protected]>
Subject: Re: Metron HBase conditional enrichment

I think most of those restricted functions are in the metron-management section.



On May 25, 2017 at 07:27:24, Nick Allen ([email protected]) wrote:
> everywhere I can use Stellar DSL, all of the functions have been implemented 
> and ready to use?

Generally, yes, you are right.

I vaguely remember a couple instances of functions that are useful in the REPL 
only, but I cannot remember what those are right now.  Hopefully we have those 
doc'd appropriately.



On Wed, May 24, 2017 at 10:38 PM, Ali Nazemian <[email protected]> wrote:
Hi Nick,

I was not sure about the implementation, so does it generally mean everywhere I 
can use Stellar DSL, all of the functions have been implemented and ready to 
use?

Cheers,
Ali

On Thu, May 25, 2017 at 2:52 AM, Nick Allen <[email protected]> wrote:
> can I do the concatenation on the fly at the enrichment level, so I don't 
> need to store this temp field in Elasticsearch/HDFS.

Sure, absolutely.

> Moreover, I need to have a conditional enrichment to say if you couldn't find 
> any match for "tenant_name+device_type+device_name" lookup for 
> "tenant_name+device_type+default_device".

Yes, you can.  You've got if/else, JOIN, IS_EMPTY, and others that should make 
implementing this logic pretty easy.
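
To make the shape of that logic concrete, here is a minimal sketch in plain Python rather than Stellar. It only models the control flow: build the key, look it up, and fall back to the default-device key when nothing is found. The dict stands in for the HBase enrichment table, and the "+" delimiter and the literal "default_device" are assumptions taken from the wording of the question. `join`, `is_empty`, and `enrich` are hypothetical helpers that merely mirror the roles of JOIN, IS_EMPTY, and if/else; none of this is Metron's actual API.

```python
def join(parts, delimiter):
    """Stand-in for the role JOIN plays: concatenate the key parts."""
    return delimiter.join(parts)


def is_empty(value):
    """Stand-in for the role IS_EMPTY plays: true for None or empty values."""
    return value is None or len(value) == 0


def enrich(message, table, delimiter="+", default_device="default_device"):
    """Look up the device-specific key, falling back to the default key."""
    prefix = [message["tenant_name"], message["device_type"]]
    key = join(prefix + [message["device_name"]], delimiter)
    result = table.get(key)
    if is_empty(result):
        # No match for the specific device: retry with the default device.
        result = table.get(join(prefix + [default_device], delimiter))
    # The key is a local variable only; it is never added to the message.
    return {} if is_empty(result) else result


# Hypothetical table contents, keyed the way the question describes.
table = {
    "acme+firewall+fw-01": {"owner": "netops"},
    "acme+firewall+default_device": {"owner": "unassigned"},
}

known = enrich({"tenant_name": "acme", "device_type": "firewall",
                "device_name": "fw-01"}, table)
unknown = enrich({"tenant_name": "acme", "device_type": "firewall",
                  "device_name": "fw-99"}, table)
print(known, unknown)
```

Because the key is computed inside the lookup and never written back to the message, nothing extra would need to be stored downstream. That is the property the first question asks for.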




On Tue, May 23, 2017 at 10:34 PM, Ali Nazemian <[email protected]> wrote:
Hi,

I was wondering how I can manage Stellar syntax to be aligned with the 
following structure for the HBase enrichment:

HBase_row_key: tenant_name+device_type+device_name

At a high level, I need to create a separate field via a post-parse Stellar 
function that is a concatenation of tenant_name, device_type and device_name. 
Let's call this field "key". Basically, I need to do the enrichment on the 
"key", which corresponds to the HBase row key. My first question is: 
can I do the concatenation on the fly at the enrichment level, so I don't need 
to store this temp field in Elasticsearch/HDFS.

Moreover, I need to have a conditional enrichment to say if you couldn't find 
any match for "tenant_name+device_type+device_name" lookup for 
"tenant_name+device_type+default_device". The second question would be how can 
I manage a conditional enrichment like this one? I would be really grateful if 
you could provide an example.

Regards,
Ali




--
A.Nazemian



