Test data already exists, and you updated several of the files a while
ago:
http://svn.apache.org/viewvc/incubator/devicemap/trunk/data/test-data/

Currently this repository is ignored by the release proposal; it still has
an old draft version, "0.99" or so.
If it were to be added to an official V1.0 tag, the POM version should also
match, and the artifactId should follow a pattern similar to the "main"
data, e.g. "devicemap-test-data".

As long as you are not violating any IP or disclosing confidential
information of your customers, you should be OK to contribute there, but
that's the same as with everything else.

Werner

On Tue, Jul 8, 2014 at 11:53 AM, eberhard speer jr. <[email protected]>
wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Werner, 'guys'...
>
> most of the test data -- the close to 1 million ua-strings -- comes
> from 'me' in that depending on the agreements with my clients I'm free
> to 'publish' the ua-strings from their logs 'anonymously' in test data.
> And I'll be adding a new batch any day now.
>
> Currently, the 'deal' is anyone who sends me their web-access logs can
> count on :
>
> - - my *absolute* discretion...I could name you a few companies that
> send me their logs, but... ;-)
> - - a 'vanilla' analyses : web-traffic, security, user-agents,...
> - - me using the user-agent strings in test data-sets
> - - the source data being destroyed [wouldn't have the capacity to hang
> onto it anyway, some of these cloud provider access logs roll-over
> every other minute to keep the file size manageable...go figure]
>
> I have no problem contributing the test data [as long as the original
> donors don't]. I also have no problem contributing the 'know-how',
> code and underlying models : how to get from web-access log [any/all
> formats] to 'anonymized' yet trace-able set of User-Agent strings.
> But, as you can imagine, this is a Big Thing and steers DeviceMap into
> log analyses territory.
>
> I have that whole infra in .Net with a good 'old' sturdy normalized
> RDBMS in the back, as I do for the DDR device data.
> Having a "sturdy normalized RDBMS in the back" for the DDR device
> data, and clients to access it, is maybe a bigger priority now.
>
> Maybe in 1st instance we should consider setting up a test-data
> repository and later consider whether or not to go into log parsing
> territory. If and when we do, do count me in :-)
>
> esjr
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.8 (MingW32)
> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
>
> iQEcBAEBAgAGBQJTu7+NAAoJEOxywXcFLKYc6wMIAIIIWh8NPrt+Hv3s1UUw3mQJ
> mjHiVKR+pdW66/v1s/JZJKthG6qgJKF2mYVWXYGTLp15d717/DiIfvjHHsifl5w+
> p4KP+Hj7nsE2rR1QV7LESoq++QLrLLk4OIdj1YL8DARv2MsbLKz1pbA34hgnGUE6
> nTn5uJn9r/r5JQOYnnL/IMk3fAsZEuRUKUNs4IwzN0xuTpL2qvexSzLOLB7063Tk
> K6SE18tt5i4y8zWIh6kYxNC1hsXIuPsDPxfdZDGW0uQWho1aIGoeMApU/VBmM8kD
> rTHLh+ABJBvTfWZf3Vc1UUaRMbpg/vdraPCMdO93tA34z3VnQ2Sq35SChn5yQtg=
> =4+wn
> -----END PGP SIGNATURE-----
>
