At this point in the execution, there is no knowledge of the type, so you can’t ask for Long, since you don’t know. That would have to be plumbed in.
There is a possibility that the STELLAR conversion functions could be used to patch things up here ( although I think that proposed fix is still correct ). My question with that would be how to use stellar in the enrichment configuration to coerce types ( I think it would be better to not do enrichment focused type coercion in the parser, but maybe not ). On November 16, 2016 at 13:35:12, Matt Foley ([email protected]) wrote: It seems to me that especially at the Stellar level, or anywhere else users may enter values manually, we should be tolerant of both String and Long values when the semantics of the entry is an integral number. In such a case, I would like to see (1) read the value in a way that accepts either String or Long input, and convert it to Long to prove it has appropriate content, then (2) convert to String, if that is the desired representation. I think this would be satisfied by “coerceValue(Long.class).toString()”, where coerceValue() calls the apache.commons.beanutils conversion routines IMHO, --Matt On 11/16/16, 8:31 AM, "Otto Fowler" <[email protected]> wrote: I’ll have a PR for this soon. It is hard to see where besides at the stellar level ( where you get the whole map and assuming map is OK ) it would ever be safe or valid to call getValue(Class). On November 16, 2016 at 11:23:46, Otto Fowler ([email protected]) wrote: Right. This is also in HostsFromJSONLists adapter as well. I created a failing test for SimpleHBaseAdapterTest: @Test public void testEnrichLong() throws Exception { SimpleHBaseAdapter sha = new SimpleHBaseAdapter(); sha.lookup = lookup; SensorEnrichmentConfig broSc = JSONUtils.INSTANCE.load(sourceConfigStr, SensorEnrichmentConfig.class); JSONObject actualMessage = sha.enrich(new CacheKey("test", "test", broSc)); Assert.assertEquals(actualMessage, new JSONObject()); actualMessage = sha.enrich(new CacheKey("ip_dst_addr", new Long(10L), broSc)); Assert.assertEquals(actualMessage,new JSONObject()); } I was able to prove out that replacing getValue(String.class) with getValue().toString() would work ( all the tests pass ). I agree that coerceValue() is better though and will try that. On November 16, 2016 at 11:15:46, Casey Stella ([email protected]) wrote: The core problem, from my understanding, is that the HBase enrichment does not support non-strings as keys and it's making the assumption that the indicator being passed is a String. In this case, it's a Long. CacheKey is a more general object and should be allowed to not be string dependent. In fact, there is already a method that is called "coerceValue" to coerce the value into the type of your choosing. In my opinion, the bug that this JIRA has unearthed is that we're making the assumptions on the cache key that we are not handling when doing HBase enrichments. This, I think should be corrected by coercing the value into a String type when doing HBase enrichments, which would mean changing the "getValue" calls to "coerceValue" calls at the following places: - SimpleHBaseAdapter line 79 - ThreatIntelAdapter line 80 and line 60 and converting the indicator argument to a String on line 148 of SimpleHBaseEnrichmentFunctions The other argument is that we should just be throwing a better error message and insist that the data comes in as a consistent type. I'm not so fond of this argument, personally. Casey On Wed, Nov 16, 2016 at 10:59 AM, Otto Fowler <[email protected]> wrote: > Looking for some background here. If you read the jira, the issue we have > is a fully numeric value for a field being looked at for enrichment is > causing an exception at enrichment time because we are using casting, and > trying to cast a Long to a String. > > We support getting the value the following ways: > > public Object getValue() { > return value; > } > > public <T> T getValue(Class<T> clazz) { > return clazz.cast(getValue()); > } > > > > In practice, we pretty much always call this with either Map.class or > String.class. > All of the tests we have as well always use Strings. If we want a String, > based on CacheKey we cannot assume as we do the ability to cast to it, and > maybe should do getValue().toString() or something. > > In METRON-567 - it looks like the parser is mapping the csv input through > grok to either strings or longs based on their composition ( I have not > gone down the rabbit hole to the grok side of this yet to confirm where the > mapping to object type is done ). So that will be a problem as well. I > think based on my current understanding that each Cache is per message > operation, and we don’t have to worry that field _a may be Long one message > and String in the next. > > From the readme.md in enrichments, there doesn’t seem to be a restriction > on the ‘type’ of a field in the enrichment mappings to String. > > Any thoughts? >
