Re: the weak docs: have we considered getting involved in this? [1]
If interested I can find out if Apache is going in as an org, or if we need
to submit ourselves. I mentored Google Summer of Code a while back.

[1] https://developers.google.com/season-of-docs/docs/timeline

On Tue, Apr 2, 2019 at 12:44 PM Frank Greguska <[email protected]> wrote:

> Unfortunately we are pretty sparse on documentation at this point so I will
> try to briefly summarize the work in that area here:
>
> The anomaly detection consists mainly of two parts, analysis of the data
> itself and the ability to publish anomalies.
>
> In terms of analyzing the data, we have focused on an algorithm we refer to
> as the "Daily Difference Average". In practice what this means is that for
> a given user-selected area and timeframe, for each day we extract the
> measurement values in the area on that day. Then we access a pre-computed
> climatology for that dataset and extract the same area from the
> climatology. Then we subtract the climatology data values from the data
> values on that day. Finally we average the differences in values. In this
> way we can graph how far from "normal" (the climatology) measurements in
> an area on a given day are.
>
> The algorithm is implemented here:
>
> https://github.com/apache/incubator-sdap-nexus/blob/49d7d43ea6c64e2d3055ab9af4ba07b948bbd2e1/analysis/webservice/algorithms_spark/DailyDifferenceAverageSpark.py
>
> An example of the resulting plot can be seen here where we plot the
> difference from average sea surface temperature for the El Niño 3.4 region:
> https://imgur.com/a/gRLSrv8 (attaching the image directly causes the apache
> mail server to reject the message so I've uploaded it to imgur).
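
To make the description above concrete, here is a minimal sketch of the
"Daily Difference Average" steps as I read them. This is not the SDAP
implementation (that one lives in DailyDifferenceAverageSpark.py and runs on
Spark); the function name, the array layout and the NaN handling are my own
assumptions for illustration. It assumes the area has already been cut out of
both the data and the climatology, and that the climatology has been matched
to each day.

```python
import numpy as np


def daily_difference_average(daily_values, climatology_values):
    """Average of (data - climatology) over the area, one value per day.

    Hypothetical helper, not part of SDAP. Both inputs are assumed to have
    shape (n_days, n_lat, n_lon), already subset to the selected area, with
    NaN marking cells that have no measurement.
    """
    data = np.asarray(daily_values, dtype=float)
    clim = np.asarray(climatology_values, dtype=float)
    if data.shape != clim.shape:
        raise ValueError("data and climatology must have the same shape")

    # Subtract the climatology from the measurements, cell by cell.
    differences = data - clim

    # Average the differences over the area for each day, skipping NaN cells.
    per_day = differences.reshape(differences.shape[0], -1)
    return np.nanmean(per_day, axis=1)
```

Plotting the returned values against the dates would give a curve of the
kind Frank describes. Note that this sketch uses a plain unweighted mean; it
does not attempt any area weighting by latitude, and a day with no valid
cells comes back as NaN (with a NumPy warning).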
>
> The ability to publish anomalies comes mainly from our Edge project:
> https://github.com/apache/incubator-sdap-edge
> In particular, the "oceanxtremes" plugin:
>
> https://github.com/apache/incubator-sdap-edge/tree/71d190599ca79591ef2bf2c116bfa86bc281059c/src/main/python/plugins/oceanxtremes
> This allows users to submit "anomalies" that capture the parameters used
> during the query so that other researchers can load up the exact same data
> and have a look for themselves. It also integrates with datacasting (
> https://datacasting.jpl.nasa.gov/) which is an RSS style feed that
> researchers could subscribe to in order to be notified of new anomalies.
>
> I believe that mostly summarizes the work done so far, if anyone else has
> further input please share.
>
> Thanks,
>
> -Frank
>
> On Mon, Apr 1, 2019 at 12:34 PM Julian Feinauer <
> [email protected]> wrote:
>
> > Hi Lewis,
> > Hi Frank,
> >
> > Thank you!
> > Of course I'll try to help with the release and RC checking.
> > I'm very interested in the anomaly detection... But did not find that
> > much documentation about it.
> > Could you point me towards it?
> >
> > Julian
> >
> > Sent from my mobile phone
> >
> >
> > -------- Original message --------
> > Subject: Re: ... / Let me introduce myself
> > From: Frank Greguska
> > To: [email protected]
> > Cc:
> >
> > Welcome Julian, glad to have you on board.
> >
> > On Mon, Apr 1, 2019 at 8:25 AM Lewis John McGibbney <[email protected]>
> > wrote:
> >
> > > Hi Julian,
> > > Sounds great.
> > > Is there any particular part of SDAP that you're interested in?
> > > The community is working towards its first incubating release so
> > > hopefully you will be able to try it out soon. Reviewing the release
> > > candidate when it is prepared would be a real big help.
> > > Lewis
> > >
> > > On 2019/03/29 08:49:40, Julian Feinauer <[email protected]>
> > > wrote:
> > > > Hi all,
> > > >
> > > > please excuse this mail..
> > > > this was some miscommunication between my mail client and me during
> > > > subscription to the list.
> > > > So let me briefly introduce myself… my name is Julian and I am a
> > > > mathematician (did my PhD in stochastics).
> > > > I live in Germany and am the founder of a startup where we do a lot
> > > > of data analytics, especially on “industrial data” and stream
> > > > processing.
> > > > I’m involved in other incubating projects in this area as well,
> > > > namely PLC4X, Edgent and IoTDB (and I make minor contributions to
> > > > other projects).
> > > > Thus, I am very interested in what you guys do here and can hopefully
> > > > contribute a bit : )
> > > >
> > > > Julian
> > > >
> > > > On 2019/03/29 08:26:45, Julian Feinauer <[email protected]> wrote:
