So if I send 3 consecutive events with different arbitrary fields, will the schema be updated 3 times in a row? How often does the server receive events with a different arbitrary map? Can we expect situations where every event has a different arbitrary map?
Regards,
Gihan

On Wed, Dec 2, 2015 at 4:53 PM, Malith Dhanushka <[email protected]> wrote:

> On Wed, Dec 2, 2015 at 4:47 PM, Sinthuja Ragendran <[email protected]> wrote:
>
>> Hi Malith,
>>
>> On Wed, Dec 2, 2015 at 4:41 PM, Malith Dhanushka <[email protected]> wrote:
>>
>>> Hi Folks,
>>>
>>> We had an offline chat about this.
>>>
>>> Since indexing all the arbitrary fields is not feasible with the
>>> current architecture, the requirement of indexing arbitrary fields in
>>> the log analyzer will be handled in the Log Analyzer REST API. The idea
>>> is to compare the incoming event with the existing schema, which is
>>> kept in memory, and if there is a change, to update the table schema.
>>
>> In this case, are all the fields going to be indexed? Is there any way
>> with this solution to say I need specific fields (say x, y, z) to be
>> indexed in the log event, and not all the fields?
>
> No. In this approach the client won't send the table schema beforehand.
> Upon a change in an event, the REST API will dynamically update the
> schema. Since this is a log-analyzer-specific scenario, all the events
> need to be indexed.
>
> Thanks
>
>> Thanks,
>> Sinthuja.
>>
>>> Overriding the table schema will make the event sink configuration
>>> inconsistent with the table schema. To avoid that, the event sink
>>> feature needs to be improved to support merging table schemas. For
>>> that, the event persist feature should have a flag to enable/disable
>>> merging table schemas.
>>>
>>> Thanks,
>>>
>>> On Wed, Dec 2, 2015 at 1:30 PM, Sinthuja Ragendran <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> On Wed, Dec 2, 2015 at 11:05 AM, Anjana Fernando <[email protected]> wrote:
>>>>
>>>>> On Wed, Dec 2, 2015 at 10:17 AM, Sachith Withana <[email protected]> wrote:
>>>>>
>>>>>> Now that we are using logstash out of the box, without the
>>>>>> DASConnector, it won't do that.
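[Editor's note] The "compare the incoming event with the in-memory schema and update on change" approach described above could be sketched roughly as follows. This is a hypothetical illustration only: the dict-based schema, the `merge_event_into_schema` helper, and the persistence step in the comment are assumptions, not the actual DAS/LAS API.

```python
# Hypothetical sketch of the "compare with in-memory schema, update on
# change" idea. The schema is modelled as a field-name -> type-name map;
# the real DAS table schema is richer (index/score types, primary keys).

cached_schema = {}  # in-memory copy of the master table's schema

def merge_event_into_schema(event_fields):
    """Return the fields that are new, updating the cached schema."""
    new_fields = {name: type(value).__name__
                  for name, value in event_fields.items()
                  if name not in cached_schema}
    if new_fields:
        cached_schema.update(new_fields)
        # Here the real implementation would persist the merged schema
        # (with indexing enabled) via the server-side OSGi services.
    return new_fields

# Three consecutive events with different arbitrary fields trigger
# three consecutive schema updates, which is the concern raised above.
merge_event_into_schema({"ip": "10.0.0.1", "status": 200})
merge_event_into_schema({"ip": "10.0.0.2", "latency_ms": 12})
merge_event_into_schema({"user": "bob"})
```

An event whose fields are all already known leaves the schema untouched, so the update cost is only paid when a new/updated log source appears.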
>>>>>> The logstash agent would just start publishing, and with the
>>>>>> current design, AFAIK, the schema setting would be handled by the
>>>>>> LAS server.
>>>>>
>>>>> Oh yeah, I see ..
>>>>>
>>>>>> BTW, for that requirement, can we provide a way to allow indexing
>>>>>> all the columns?
>>>>>
>>>>> Well .. we can .. I guess this is the same thing that Malith
>>>>> requested in the first mail. The only thing is, we would have to
>>>>> change the internals/architecture of how we do indexing currently.
>>>>> The current logic is, we check the input value against the table
>>>>> schema and do the required indexing, e.g. if facets are defined,
>>>>> data types, etc. So if we are just saying "index all fields", it
>>>>> will be a new path there, and we would also have to introduce a new
>>>>> special flag for a table to say "index all". Also, we would need
>>>>> some mechanism for figuring out the fields of a specific log type
>>>>> in the server; at least with the table schema, we knew all the
>>>>> fields that are there for all the log types. Ideally, we need to
>>>>> store some metadata somewhere saying, for this specific log type,
>>>>> these are the fields, and so on. Do we get some kind of log
>>>>> category/type information with the standard logstash HTTP
>>>>> connector? .. Any other schema setting and storing of metadata can
>>>>> be done on the server side, and we can cache it in memory to do
>>>>> fast lookups and modifications of the schema (together with some
>>>>> cluster messaging to keep it in sync with the other nodes).
>>>>>
>>>>> Or else, maybe we are again back to writing our own logstash
>>>>> adapter, which would make the whole thing much simpler? ..
>>>>
>>>> Yeah, +1. Actually, I was also thinking that having our own logstash
>>>> adapter would be a better and cleaner way, without complicating
>>>> things much. :) Simply, if we are able to mention which fields need
>>>> to be indexed on the client side, and then make a call to the LAS
>>>> REST service before publishing data, then we can set the schema
>>>> accordingly, and things will work without any big effort.
>>>>
>>>> Thanks,
>>>> Sinthuja.
>>>>
>>>>> Cheers,
>>>>> Anjana.
>>>>>
>>>>>> On Wed, Dec 2, 2015 at 10:11 AM, Anjana Fernando <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Sachith,
>>>>>>>
>>>>>>> Doesn't the agent have knowledge of the log types/categories and
>>>>>>> their field information when it is initializing? .. As in, as I
>>>>>>> understood it, we specify which fields need to be sent out in the
>>>>>>> configurations, isn't that the case? ..
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Anjana.
>>>>>>>
>>>>>>> On Wed, Dec 2, 2015 at 10:01 AM, Sachith Withana <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi All,
>>>>>>>>
>>>>>>>> There might be a slight issue. We wouldn't know the arbitrary
>>>>>>>> fields before the log agent starts publishing, since the agent
>>>>>>>> only publishes, and we don't have control over which fields
>>>>>>>> would be sent (unless we configure all the agents ourselves).
>>>>>>>> So we would have to check each event for new fields apart from
>>>>>>>> those that are in the schema. This is undesirable.
>>>>>>>>
>>>>>>>> And as Anjana pointed out, we don't have a way to specify
>>>>>>>> indexing all the arbitrary values unless we set the schema
>>>>>>>> accordingly.
>>>>>>>>
>>>>>>>> Is it possible to specify in the schema to index everything?
>>>>>>>>
>>>>>>>> On Wed, Dec 2, 2015 at 9:38 AM, Anjana Fernando <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Malith,
>>>>>>>>>
>>>>>>>>> The functionality which you're requesting is very specific,
>>>>>>>>> and from the DAS side, it doesn't make sense to implement this
>>>>>>>>> in a generic way that is not usually used.
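[Editor's note] The "index all" table flag discussed above, as opposed to checking the input value against the table schema, could look roughly like this. A minimal sketch under assumed names: `fields_to_index`, the `index_all` flag, and the per-column `index` attribute are illustrations, not the real DAS schema model.

```python
def fields_to_index(table_schema, event_fields, index_all=False):
    """Decide which event fields get indexed.

    With index_all=True (the new special table flag proposed above),
    every incoming field is indexed and the schema check is skipped;
    otherwise only the fields declared as indexed in the table schema
    are considered, which is the current behaviour described above.
    """
    if index_all:
        return set(event_fields)
    return {name for name in event_fields
            if table_schema.get(name, {}).get("index")}
```

The new path is cheap per event, but as noted above the server then loses the schema as its record of which fields exist per log type, so some metadata store (cached in memory, kept in sync via cluster messaging) would still be needed.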
>>>>>>>>> And it is anyway not the way the log analyzer should use it.
>>>>>>>>> The different log sources will know their fields before they
>>>>>>>>> send out data; it doesn't have to be checked every time an
>>>>>>>>> event is published. A log source would instruct the log
>>>>>>>>> analyzer backend API about the new fields this specific log
>>>>>>>>> source will be sending; with that earlier message, the backend
>>>>>>>>> service will set the global table's schema properly, and then
>>>>>>>>> the remote log agent will send out log records to be processed
>>>>>>>>> by the server.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Anjana.
>>>>>>>>>
>>>>>>>>> On Tue, Dec 1, 2015 at 6:44 PM, Malith Dhanushka <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Anjana,
>>>>>>>>>>
>>>>>>>>>> Yes. The requirement is for the internal log-related REST API,
>>>>>>>>>> which is being written using OSGi services. From the
>>>>>>>>>> perspective of log analysis data, we have one master table to
>>>>>>>>>> persist all the log events from the different log sources.
>>>>>>>>>> Log data comes in to the log REST API as arbitrary fields, so
>>>>>>>>>> different log sources have different sets of arbitrary
>>>>>>>>>> fields, which forces the log REST API to change the schema of
>>>>>>>>>> the master table every time it receives log events from a
>>>>>>>>>> new/updated log source. That's what I meant by inaccurate,
>>>>>>>>>> and it can be solved in a much cleaner way by having that
>>>>>>>>>> flag to index or not index arbitrary fields for a particular
>>>>>>>>>> stream.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Malith
>>>>>>>>>>
>>>>>>>>>> On Tue, Dec 1, 2015 at 6:06 PM, Anjana Fernando <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Malith,
>>>>>>>>>>>
>>>>>>>>>>> No, it cannot be done like that.
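[Editor's note] The "declare fields first, then publish" flow described above amounts to the log source sending one schema-declaration message before its log records. A rough sketch of what such a declaration payload might look like; the endpoint shape, `build_schema_request`, and the JSON layout are assumed for illustration and are not the actual LAS API.

```python
import json

def build_schema_request(table, indexed_fields):
    """Build the JSON body a log source could POST to a (hypothetical)
    LAS schema endpoint before it starts publishing, declaring its
    fields and that they should be indexed. The backend would merge
    this into the global table's schema."""
    return json.dumps({
        "tableName": table,
        "columns": [{"name": name, "type": col_type, "index": True}
                    for name, col_type in indexed_fields.items()],
    })

body = build_schema_request("LOGANALYZER",
                            {"host": "STRING", "status": "INTEGER"})
```

After this single up-front call succeeds, the agent streams events without any per-event schema check, which is the ordering Anjana argues for above.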
>>>>>>>>>>> The way the indexing happens is, it looks up the table
>>>>>>>>>>> schema for a table and does the indexing according to that,
>>>>>>>>>>> so the table schema must be set beforehand. It is not a
>>>>>>>>>>> dynamic thing that can be set when arbitrary fields are sent
>>>>>>>>>>> to the receiver, and it cannot load the current schema and
>>>>>>>>>>> set it for every event; even though we could cache that
>>>>>>>>>>> information and do some operations, that gets complicated.
>>>>>>>>>>> So the idea is, it is the responsibility of the client to
>>>>>>>>>>> set the target table's schema properly beforehand, which may
>>>>>>>>>>> or may not include arbitrary fields, and then send the data.
>>>>>>>>>>>
>>>>>>>>>>> Also, if this requirement is for the log analytics solution
>>>>>>>>>>> work, as we've discussed before, there should be a whole new
>>>>>>>>>>> remote API for that, and that API can do these operations
>>>>>>>>>>> inside the server using the OSGi services, and not the
>>>>>>>>>>> original DAS REST API. Those operations will then happen
>>>>>>>>>>> automatically while keeping the remote log-related API
>>>>>>>>>>> clean.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Anjana.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Dec 1, 2015 at 5:13 PM, Malith Dhanushka <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Folks,
>>>>>>>>>>>>
>>>>>>>>>>>> Currently, indexing arbitrary fields is achieved by
>>>>>>>>>>>> dynamically updating the analytics table schema through the
>>>>>>>>>>>> analytics REST API. This is not an accurate solution for a
>>>>>>>>>>>> frequently updating schema, so the ideal solution would be
>>>>>>>>>>>> to have a flag in the data bridge event sink configuration
>>>>>>>>>>>> to enable/disable indexing for all arbitrary fields.
>>>>>>>>>>>>
>>>>>>>>>>>> WDUT?
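[Editor's note] The per-stream flag Malith proposes in the original mail above would make the indexing decision for arbitrary fields a one-bit configuration lookup instead of a schema mutation. A minimal sketch; the `indexArbitraryFields` key and the `index_arbitrary` helper are illustrative assumptions, not an existing event sink option.

```python
def index_arbitrary(stream_config, arbitrary_fields):
    """With the proposed per-stream event sink flag, either all of an
    event's arbitrary fields are indexed or none are, so the table
    schema never has to change per event."""
    if stream_config.get("indexArbitraryFields"):
        return set(arbitrary_fields)
    return set()
```

The trade-off, raised later in the thread, is that "all or nothing" gives no way to index only specific fields (say x, y, z).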
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Malith
>>>>>>>>>>>> --
>>>>>>>>>>>> Malith Dhanushka
>>>>>>>>>>>> Senior Software Engineer - Data Technologies
>>>>>>>>>>>> WSO2, Inc. : wso2.com
>>>>>>>>>>>> Mobile : +94 716 506 693
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Anjana Fernando
>>>>>>>>>>> Senior Technical Lead
>>>>>>>>>>> WSO2 Inc. | http://wso2.com
>>>>>>>>>>> lean . enterprise . middleware
>>>>>>>>
>>>>>>>> --
>>>>>>>> Sachith Withana
>>>>>>>> Software Engineer; WSO2 Inc.; http://wso2.com
>>>>>>>> E-mail: sachith AT wso2.com
>>>>>>>> M: +94715518127
>>>>>>>> Linked-In: https://lk.linkedin.com/in/sachithwithana
>>>>
>>>> --
>>>> Sinthuja Rajendran
>>>> Associate Technical Lead
>>>> WSO2, Inc.: http://wso2.com
>>>> Blog: http://sinthu-rajan.blogspot.com/
>>>> Mobile: +94774273955

--
W.G. Gihan Anuruddha
Senior Software Engineer | WSO2, Inc.
M: +94772272595
_______________________________________________
Dev mailing list
[email protected]
http://wso2.org/cgi-bin/mailman/listinfo/dev
