Andre,

Sounds good. I suspect we can find some value there, so if you're willing to take up the task, could you open a JIRA? We can move this discussion there and flesh out what it might look like. As mentioned, I think this case, and its ilk, is a common pattern, so perhaps we can draw from the community at large to see what makes sense and iterate toward a processor that closes this gap in functionality.
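To make the discussion a bit more concrete, here is a minimal Python sketch of the three options listed in the quoted message below (shlex-style quoting, a configurable key/value separator, a configurable pair separator), with JSON as the downstream representation. It is only an illustration under stated assumptions: `parse_kv`, its parameters, and the sample addresses are made up for this example, it is not the ParseKV processor or any NiFi API, and collecting repeated keys into lists is one possible choice, not a settled one.

```python
import json
import shlex


def parse_kv(text, kv_sep="=", pair_sep=None):
    """Parse key/value pairs with shell-like quoting (hypothetical helper).

    kv_sep   -- separator between a key and its value
    pair_sep -- characters that separate pairs; None means any whitespace
    Repeated keys are collected into lists (an assumption of this sketch).
    """
    lexer = shlex.shlex(text, posix=True)
    lexer.whitespace_split = True   # split tokens only on separator characters
    lexer.commenters = ""           # do not treat '#' as a comment marker
    if pair_sep is not None:
        lexer.whitespace = pair_sep

    pairs = {}
    for token in lexer:
        key, sep, value = token.partition(kv_sep)
        if not sep or not key:
            # Token without a key/value separator: skipped in this sketch.
            continue
        pairs.setdefault(key, []).append(value)
    return pairs


if __name__ == "__main__":
    # Made-up record mirroring the "same key repeated, one quoted field" case.
    line = ('from=a@example.com to=b@example.com to=c@example.com '
            'subject="I had enough of this"')
    print(json.dumps(parse_kv(line)))
```

Note that an unquoted multi-word value (the second example in the quoted thread) would be split into separate tokens here, and unbalanced quotes make `shlex` raise `ValueError`; how a processor should treat both cases is exactly the kind of thing worth settling in the JIRA.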
On Wed, Jun 22, 2016 at 8:31 PM, Andre <[email protected]> wrote:

> Aldrin,
>
> > ParseKV definitely seems to fulfill a certain need but am curious as to
> > how we might make it provide a bit more coverage of similar formats.
> > With configuration that allows specification of both the key value
> > separators (in the example you provided a space, or new line) and the
> > delimiter of the pairings, this would possibly provide support for
> > types of files as well, such as Properties/ini file formats. I do find
> > myself uncertain of how much it would apply to the latter cases though.
> > I can see how the format would map pretty nicely to columnar type
> > stores.
>
> This is exactly what I have in mind. Not only plain kv but the ability to:
>
> 1. run in "shlex like" mode parsing (escaping single quotes, double
>    quotes, slashes, etc)
> 2. define the key value separator
> 3. define the kv pair separator
>
> It may make sense to allow the user to define the EOL demarcation
> character as well.
>
> > Might you be able to expand on how this information would typically be
> > handled downstream?
>
> Good question. Given attributes don't seem to be able to contain
> unserialized arrays, I suspect replacing content with JSON downstream
> would be a handy choice but I am open to different suggestions.
>
> Cheers.
>
> > On Wed, Jun 22, 2016 at 10:35 AM, Andre <[email protected]> wrote:
> >
> > > Aldrin,
> > >
> > > On Wed, Jun 22, 2016 at 12:24 AM, Aldrin Piri <[email protected]>
> > > wrote:
> > >
> > > > Concerning the ParseKV, are you aware of the getDelimitedField[1]
> > > > function in Expression Language? I think this may take care of
> > > > this case for handling these items.
> > >
> > > I am aware of getDelimitedField but I found a few cases where using
> > > it becomes a bit challenging:
> > >
> > > * Multiple instances of the same key and poorly defined format (note
> > >   how just one field uses quotes):
> > >
> > >   [email protected] [email protected] [email protected] subject="I had enough of this"
> > >
> > > * Variable set of keys (tag wasn't present, now it is):
> > >
> > >   [email protected] [email protected] [email protected] [email protected] tag=important tag=vip tag=tag1 ... tag=tag55 subject=I had enough of this
> > >
> > > If you think it is reasonably doable I am happy to reconsider.
> > >
> > > For the security folks like me, QueryBulkWhois and QueryDNS are very
> > > different beasts:
> > >
> > > * QueryDNS does what a normal DNS resolver does, but because of the
> > >   parsing mechanism it can be used to handle responses in a smart
> > >   way. As such one can use QueryDNS to use DNS based API
> > >   (ShadowServer, Cymru, Cisco SenderBase [1]), RBLs (Spamhaus, etc).
> > >
> > > * Enters QueryBulkWhois: batching optimises queries by allowing a
> > >   large number of subjects to be submitted using a single request.
> > >
> > > Yes, BulkWhois may be offered by providers that may also provide API
> > > but these are not restricted to overlapping offerings, however
> > > projects like "Prefix WhoIs Project" only offer Whois with no DNS
> > > API available at all.
> > >
> > > [1]
> > > http://stackoverflow.com/questions/14145886/how-to-programmatically-query-senderbase-org
> > >
> > > > With the QueryBulkWhois API, does it make sense to roll this into
> > > > the QueryDNS as a configurable property to do batch? Performing a
> > > > cursory review of the PR, it looks like this would potentially be
> > > > targeting those same servers. Are batch lookups to more web
> > > > service oriented endpoints as opposed to just querying DNS?
> > > >
> > > > --aldrin
