It seems to me that especially at the Stellar level, or anywhere else users may 
enter values manually, we should be tolerant of both String and Long values 
when the semantics of the entry is an integral number.  In such a case, I would 
like to see (1) read the value in a way that accepts either String or Long 
input, and convert it to Long to prove it has appropriate content, then (2) 
convert to String, if that is the desired representation.

 

I think this would be satisfied by “coerceValue(Long.class).toString()”, where 
coerceValue() calls the apache.commons.beanutils conversion routines

 

IMHO,

--Matt

 

On 11/16/16, 8:31 AM, "Otto Fowler" <[email protected]> wrote:

 

    I’ll have a PR for this soon.

    

    It is hard to see where besides at the stellar level ( where you get the

    whole map and assuming map is OK ) it would ever be safe or valid to call

    getValue(Class).

    

    

    On November 16, 2016 at 11:23:46, Otto Fowler ([email protected])

    wrote:

    

    Right.

    

    This is also in HostsFromJSONLists adapter as well.

    

    I created a failing test for SimpleHBaseAdapterTest:

    

    @Test

    public void testEnrichLong() throws Exception {

      SimpleHBaseAdapter sha = new SimpleHBaseAdapter();

      sha.lookup = lookup;

      SensorEnrichmentConfig broSc =

    JSONUtils.INSTANCE.load(sourceConfigStr,

    SensorEnrichmentConfig.class);

      JSONObject actualMessage = sha.enrich(new CacheKey("test", "test", 
broSc));

      Assert.assertEquals(actualMessage, new JSONObject());

      actualMessage = sha.enrich(new CacheKey("ip_dst_addr", new Long(10L), 
broSc));

      Assert.assertEquals(actualMessage,new JSONObject());

    }

    

    I was able to prove out that replacing getValue(String.class) with

    getValue().toString() would work ( all the tests pass ).

    I agree that coerceValue() is better though and will try that.

    

    

    

    On November 16, 2016 at 11:15:46, Casey Stella ([email protected]) wrote:

    

    The core problem, from my understanding, is that the HBase enrichment does

    not support non-strings as keys and it's making the assumption that the

    indicator being passed is a String. In this case, it's a Long. CacheKey

    is a more general object and should be allowed to not be string dependent.

    In fact, there is already a method that is called "coerceValue" to coerce

    the value into the type of your choosing.

    

    In my opinion, the bug that this JIRA has unearthed is that we're making

    the assumptions on the cache key that we are not handling when doing HBase

    enrichments. This, I think should be corrected by coercing the value into

    a String type when doing HBase enrichments, which would mean changing the

    "getValue" calls to "coerceValue" calls at the following places:

    

    - SimpleHBaseAdapter line 79

    - ThreatIntelAdapter line 80 and line 60

    

    and converting the indicator argument to a String on line 148 of

    SimpleHBaseEnrichmentFunctions

    

    The other argument is that we should just be throwing a better error

    message and insist that the data comes in as a consistent type. I'm not so

    fond of this argument, personally.

    

    Casey

    

    On Wed, Nov 16, 2016 at 10:59 AM, Otto Fowler <[email protected]>

    wrote:

    

    > Looking for some background here. If you read the jira, the issue we have

    > is a fully numeric value for a field being looked at for enrichment is

    > causing an exception at enrichment time because we are using casting, and

    > trying to cast a Long to a String.

    >

    > We support getting the value the following ways:

    >

    > public Object getValue() {

    > return value;

    > }

    >

    > public <T> T getValue(Class<T> clazz) {

    > return clazz.cast(getValue());

    > }

    >

    >

    >

    > In practice, we pretty much always call this with either Map.class or

    > String.class.

    > All of the tests we have as well always use Strings. If we want a String,

    > based on CacheKey we cannot assume as we do the ability to cast to it, and

    > maybe should do getValue().toString() or something.

    >

    > In METRON-567 - it looks like the parser is mapping the csv input through

    > grok to either strings or longs based on their composition ( I have not

    > gone down the rabbit hole to the grok side of this yet to confirm where

    the

    > mapping to object type is done ). So that will be a problem as well. I

    > think based on my current understanding that each Cache is per message

    > operation, and we don’t have to worry that field _a may be Long one

    message

    > and String in the next.

    >

    > From the readme.md in enrichments, there doesn’t seem to be a restriction

    > on the ‘type’ of a field in the enrichment mappings to String.

    >

    > Any thoughts?

    >

    

 

Reply via email to