So if I send 3 consecutive events with different arbitrary fields, does
this schema update 3 times consecutively? How often does the server get
events that have a different arbitrary map? Can we expect situations where
each event has a different arbitrary map?
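To make the question concrete, here is a minimal sketch of the compare-and-update logic Malith describes below (compare the incoming event against an in-memory schema and update only on a change). All class and method names here are hypothetical, not the actual DAS/LAS API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: keep the table schema cached in memory and rewrite the persisted
// schema only when an event carries a field name the cache has not seen.
public class SchemaUpdater {

    // field name -> field type, e.g. "host" -> "string"
    private final Map<String, String> cachedSchema = new LinkedHashMap<>();
    private int updateCount = 0; // how many times the table schema was rewritten

    /** Returns true if this event forced a schema update. */
    public boolean onEvent(Map<String, String> arbitraryFields) {
        boolean changed = false;
        for (Map.Entry<String, String> field : arbitraryFields.entrySet()) {
            // Only a previously unseen field name counts as a schema change.
            if (!cachedSchema.containsKey(field.getKey())) {
                cachedSchema.put(field.getKey(), field.getValue());
                changed = true;
            }
        }
        if (changed) {
            updateCount++; // here the real API would persist the merged schema
        }
        return changed;
    }

    public int getUpdateCount() {
        return updateCount;
    }
}
```

Under this behaviour, three consecutive events that each introduce genuinely new fields would indeed trigger three consecutive schema updates, while events whose fields are already known are absorbed by the in-memory cache with no update at all.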

Regards,
Gihan

On Wed, Dec 2, 2015 at 4:53 PM, Malith Dhanushka <[email protected]> wrote:

>
>
> On Wed, Dec 2, 2015 at 4:47 PM, Sinthuja Ragendran <[email protected]>
> wrote:
>
>> Hi Malith,
>>
>> On Wed, Dec 2, 2015 at 4:41 PM, Malith Dhanushka <[email protected]> wrote:
>>
>>> Hi Folks,
>>>
>>> We had an offline chat about this.
>>>
>>> Since indexing all the arbitrary fields is not feasible with the current
>>> architecture, the requirement of indexing arbitrary fields in the log
>>> analyzer will be handled in the Log Analyzer REST API. The idea is to
>>> compare the incoming event with the existing schema, which is kept
>>> in-memory, and if there is a change, to update the table schema.
>>>
>>
>> In this case, are all the fields going to be indexed? Is there any way
>> with this solution to say that I need specific fields (say x, y, z) in the
>> log event to be indexed, and not all the fields?
>>
>
> No. In this approach the client won't send the table schema beforehand.
> Upon a change in an event, the REST API will dynamically update the
> schema. Since this is a log-analyzer-specific scenario, all the events
> need to be indexed.
>
> Thanks
>
>>
>> Thanks,
>> Sinthuja.
>>
>>>
>>> Overriding the table schema will make the event sink configuration
>>> inconsistent with the table schema. To avoid that, the event sink feature
>>> needs to be improved to support merging table schemas. For that, the
>>> event persist feature should have a flag to enable/disable merging table
>>> schemas.
>>>
>>> Thanks,
>>>
>>> On Wed, Dec 2, 2015 at 1:30 PM, Sinthuja Ragendran <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> On Wed, Dec 2, 2015 at 11:05 AM, Anjana Fernando <[email protected]>
>>>> wrote:
>>>>
>>>>> On Wed, Dec 2, 2015 at 10:17 AM, Sachith Withana <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Now that we are using logstash out of the box, without the
>>>>>> DASConnector, it won't do that.
>>>>>>
>>>>>> Logstash would just start publishing, and with the current design,
>>>>>> AFAIK, the schema setting would be handled by the LAS server.
>>>>>>
>>>>>
>>>>> Oh yeah, I see ..
>>>>>
>>>>>
>>>>>>
>>>>>> BTW for that requirement, can we provide a way to allow indexing all
>>>>>> the columns?
>>>>>>
>>>>>
>>>>> Well .. we can .. I guess this is the same thing that Malith requested
>>>>> in the first mail. The only thing is, we have to change the
>>>>> internals/architecture of how we currently do indexing. The current
>>>>> logic is: we check the input value against the table schema and do the
>>>>> required indexing, for example, whether facets are defined, the data
>>>>> types, etc. So if we just say to index all fields, it will be a new
>>>>> path there, and we also have to introduce a new special flag for a
>>>>> table to say "index all". Also, we would need some mechanism for
>>>>> figuring out the fields of a specific log type in the server; at least
>>>>> with the table schema, we knew all the fields that exist for each log
>>>>> type. Ideally, we need to store some metadata somewhere saying that,
>>>>> for this specific log type, these are the fields, and so on. Do we get
>>>>> some kind of log category/type information with the standard logstash
>>>>> HTTP connector? .. Any other schema setting and storing of metadata can
>>>>> be done on the server side, and we can cache it in-memory to do fast
>>>>> lookups and modifications of the schema (together with some cluster
>>>>> messaging to keep it in sync with the other nodes).
>>>>>
>>>>> Or else, maybe we are again back to writing our own logstash adapter,
>>>>> which would make the whole thing much simpler? ..
>>>>>
>>>>
>>>> Yeah, +1. Actually, I was also thinking that having our own logstash
>>>> adaptor would be a better and cleaner way without complicating things
>>>> much. :) Simply put, if we are able to specify on the client side which
>>>> fields need to be indexed, and then make a call to the LAS REST service
>>>> before publishing data, then we can set the schema accordingly and
>>>> things will work without any big effort.
>>>>
>>>> Thanks,
>>>> Sinthuja.
>>>>
>>>>
>>>>> Cheers,
>>>>> Anjana.
>>>>>
>>>>>
>>>>>>
>>>>>> On Wed, Dec 2, 2015 at 10:11 AM, Anjana Fernando <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Sachith,
>>>>>>>
>>>>>>> Doesn't the agent have knowledge of the log types/categories and
>>>>>>> their field information when it is initializing? .. As in, as I
>>>>>>> understood it, we specify in the configurations which fields need to
>>>>>>> be sent out, isn't that the case? ..
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Anjana.
>>>>>>>
>>>>>>> On Wed, Dec 2, 2015 at 10:01 AM, Sachith Withana <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi All,
>>>>>>>>
>>>>>>>> There might be a slight issue. We wouldn't know the arbitrary
>>>>>>>> fields before the log agent starts publishing, since the agent only
>>>>>>>> publishes and we don't have control over which fields would be sent
>>>>>>>> (unless we configure all the agents ourselves). So we would have to
>>>>>>>> check each event for new fields apart from those already in the
>>>>>>>> schema. This is undesirable.
>>>>>>>>
>>>>>>>> And as Anjana pointed out, we don't have a way to specify indexing
>>>>>>>> all the arbitrary values unless we set the schema accordingly.
>>>>>>>>
>>>>>>>> Is it possible to specify in the schema to index everything?
>>>>>>>>
>>>>>>>> On Wed, Dec 2, 2015 at 9:38 AM, Anjana Fernando <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Malith,
>>>>>>>>>
>>>>>>>>> The functionality which you're requesting is very specific, and
>>>>>>>>> from the DAS side it doesn't make sense to implement it in a
>>>>>>>>> generic way that would not usually be used. And it is, anyway, not
>>>>>>>>> the way the log analyzer should use it. The different log sources
>>>>>>>>> will know their fields before they send out data; it doesn't have
>>>>>>>>> to be checked every time an event is published. A log source would
>>>>>>>>> instruct the log analyzer backend API about the new fields this
>>>>>>>>> specific log source will be sending, and with that earlier message
>>>>>>>>> the backend service will set the global table's schema properly;
>>>>>>>>> then the remote log agent will send out log records to be
>>>>>>>>> processed by the server.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Anjana.
>>>>>>>>>
>>>>>>>>> On Tue, Dec 1, 2015 at 6:44 PM, Malith Dhanushka <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Anjana,
>>>>>>>>>>
>>>>>>>>>> Yes. The requirement is for the internal log-related REST API,
>>>>>>>>>> which is being written using OSGi services. From the perspective
>>>>>>>>>> of log analysis data, we have one master table to persist all the
>>>>>>>>>> log events from different log sources. Log data comes into the
>>>>>>>>>> log REST API as arbitrary fields. So different log sources have
>>>>>>>>>> different sets of arbitrary fields, which forces the log REST API
>>>>>>>>>> to change the schema of the master table every time it receives
>>>>>>>>>> log events from a new/updated log source. That's what I meant by
>>>>>>>>>> inaccurate, and it can be solved in a much cleaner way by having
>>>>>>>>>> a flag to index or not index arbitrary fields for a particular
>>>>>>>>>> stream.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Malith
>>>>>>>>>>
>>>>>>>>>> On Tue, Dec 1, 2015 at 6:06 PM, Anjana Fernando <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Malith,
>>>>>>>>>>>
>>>>>>>>>>> No, it cannot be done like that. The way the indexing happens
>>>>>>>>>>> is, it looks up the schema for a table and does the indexing
>>>>>>>>>>> according to that. So the table schema must be set beforehand.
>>>>>>>>>>> It is not a dynamic thing that can be set when arbitrary fields
>>>>>>>>>>> are sent to the receiver, and the receiver cannot load the
>>>>>>>>>>> current schema and re-set it for each event; we could cache that
>>>>>>>>>>> information and do some operations, but that gets complicated.
>>>>>>>>>>> So the idea is that it is the responsibility of the client to
>>>>>>>>>>> set the target table's schema properly beforehand, which may or
>>>>>>>>>>> may not include arbitrary fields, and then send the data.
>>>>>>>>>>>
>>>>>>>>>>> Also, if this requirement is for the log analytics solution
>>>>>>>>>>> work, as we've discussed before, there should be a whole new
>>>>>>>>>>> remote API for that, and that API can do these operations inside
>>>>>>>>>>> the server using the OSGi services, not the original DAS REST
>>>>>>>>>>> API. So those operations will happen automatically while keeping
>>>>>>>>>>> the remote log-related API clean.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Anjana.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Dec 1, 2015 at 5:13 PM, Malith Dhanushka <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Folks,
>>>>>>>>>>>>
>>>>>>>>>>>> Currently, indexing arbitrary fields is achieved by dynamically
>>>>>>>>>>>> updating the analytics table schema through the analytics REST
>>>>>>>>>>>> API. This is not an accurate solution for a frequently updating
>>>>>>>>>>>> schema. So the ideal solution would be to have a flag in the
>>>>>>>>>>>> data bridge event sink configuration to enable/disable indexing
>>>>>>>>>>>> for all arbitrary fields.
>>>>>>>>>>>>
>>>>>>>>>>>> WDUT?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Malith
>>>>>>>>>>>> --
>>>>>>>>>>>> Malith Dhanushka
>>>>>>>>>>>> Senior Software Engineer - Data Technologies
>>>>>>>>>>>> *WSO2, Inc. : wso2.com <http://wso2.com/>*
>>>>>>>>>>>> *Mobile*          : +94 716 506 693
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> *Anjana Fernando*
>>>>>>>>>>> Senior Technical Lead
>>>>>>>>>>> WSO2 Inc. | http://wso2.com
>>>>>>>>>>> lean . enterprise . middleware
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Malith Dhanushka
>>>>>>>>>> Senior Software Engineer - Data Technologies
>>>>>>>>>> *WSO2, Inc. : wso2.com <http://wso2.com/>*
>>>>>>>>>> *Mobile*          : +94 716 506 693
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> *Anjana Fernando*
>>>>>>>>> Senior Technical Lead
>>>>>>>>> WSO2 Inc. | http://wso2.com
>>>>>>>>> lean . enterprise . middleware
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Sachith Withana
>>>>>>>> Software Engineer; WSO2 Inc.; http://wso2.com
>>>>>>>> E-mail: sachith AT wso2.com
>>>>>>>> M: +94715518127
>>>>>>>> Linked-In: https://lk.linkedin.com/in/sachithwithana
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> *Anjana Fernando*
>>>>>>> Senior Technical Lead
>>>>>>> WSO2 Inc. | http://wso2.com
>>>>>>> lean . enterprise . middleware
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Sachith Withana
>>>>>> Software Engineer; WSO2 Inc.; http://wso2.com
>>>>>> E-mail: sachith AT wso2.com
>>>>>> M: +94715518127
>>>>>> Linked-In: https://lk.linkedin.com/in/sachithwithana
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> *Anjana Fernando*
>>>>> Senior Technical Lead
>>>>> WSO2 Inc. | http://wso2.com
>>>>> lean . enterprise . middleware
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> *Sinthuja Rajendran*
>>>> Associate Technical Lead
>>>> WSO2, Inc.:http://wso2.com
>>>>
>>>> Blog: http://sinthu-rajan.blogspot.com/
>>>> Mobile: +94774273955
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Malith Dhanushka
>>> Senior Software Engineer - Data Technologies
>>> *WSO2, Inc. : wso2.com <http://wso2.com/>*
>>> *Mobile*          : +94 716 506 693
>>>
>>
>>
>>
>> --
>> *Sinthuja Rajendran*
>> Associate Technical Lead
>> WSO2, Inc.:http://wso2.com
>>
>> Blog: http://sinthu-rajan.blogspot.com/
>> Mobile: +94774273955
>>
>>
>>
>
>
> --
> Malith Dhanushka
> Senior Software Engineer - Data Technologies
> *WSO2, Inc. : wso2.com <http://wso2.com/>*
> *Mobile*          : +94 716 506 693
>



-- 
W.G. Gihan Anuruddha
Senior Software Engineer | WSO2, Inc.
M: +94772272595
_______________________________________________
Dev mailing list
[email protected]
http://wso2.org/cgi-bin/mailman/listinfo/dev
