On 26 August 2014 20:09:19 CEST, Adrian Custer <[email protected]> wrote:
>On 8/26/14 11:14 AM, Hanno Schlichting wrote:
>> Hi.
>>
>> It’s been a long time in the planning, but we are finally getting
>> closer to actually making the aggregated cell network data available
>> for download.
>>
>> We have worked with the OpenCellID project to agree on a new shared
>> export format, to make it easier for anyone using either of our two
>> data sources. The details of the new data format are documented at
>> http://mozilla-ichnaea.readthedocs.org/en/latest/import_export.html.
>>
>> As a concrete example, I’ve exported some of the most recent cell
>> networks from our live database. You can get the sample at:
>>
>> https://www.dropbox.com/s/vcmjuozhv0fjpmm/MLS-diff-cell-export-2014-08-26T130000.csv.gz?dl=0
>>
>>  The data is licensed under CC-0 terms, so has neither copyright nor
>> database right restrictions
>> (https://creativecommons.org/publicdomain/zero/1.0/).
>>
>> If you haven’t followed the github issues about this topic, now’s
>> your time to share feedback and concerns.
>>
>> If all goes well, we are looking at adding a public downloads section
>> to the website next Tuesday and making all of our cell network data
>> available.
>>
>> Best, Hanno
>> _______________________________________________
>> dev-geolocation mailing list [email protected]
>> https://lists.mozilla.org/listinfo/dev-geolocation
>>
>
>Hello all,
>
>Hanno's email was timely since I was about to send a mail asking about 
>this API effort. I do have "Feedback and Concerns"; here are some.
>
>
>
>At the process level, a week before launch is *very* late to be asking 
>for feedback on a public API! The GeoLocation Web API and the Mozilla 
>Location Service upload API both would have benefited from some good, 
>structured logic to avoid their bad structure and naming. I guess the 
>discussion on this API was all happening on GitHub issues and not on 
>this list. For something 'a long time in the planning' though, this 
>public notice is sadly late.
>
>
>
>The Internet end-of-line separator is [carriage-return, line-feed], as
>per all IETF standards:
>    The Text/Plain media type is the lowest common denominator of
>    Internet email, with lines of no more than 998 characters (by
>    convention usually no more than 78), and where the carriage-return
>    and line-feed (CRLF) sequence represents a line break (see [MIME-IMT]
>    and [MSG-FMT]).
>                            http://tools.ietf.org/html/rfc3676
>This started with email, was kept for HTTP, and, absent strong reasons
>to change it, sticking with the standard is the best policy.
>
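Whichever line ending the export settles on, a consumer can be written to accept both. A minimal sketch in JavaScript; the function name `splitLines` is made up for illustration and is not part of any proposed tooling:

```javascript
// Hypothetical helper: splits export text into lines, accepting both
// CRLF (the Internet convention) and bare LF line endings.
function splitLines(text) {
  const lines = text.split(/\r?\n/);
  // A file that ends with a line break yields one empty trailing entry.
  if (lines.length > 0 && lines[lines.length - 1] === "") {
    lines.pop();
  }
  return lines;
}

console.log(splitLines("mcc,net\r\n262,2\r\n").length); // 2
```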
>
>
>The proposed API needs work: the semantics are a mishmash and the naming
>is terrible. The page name 'import-export' is in direct conflict with
>the API structure, which appears to have been thought out only for
>export. Here I only discuss export because developing an API that can
>serve both will take far more time and we might as well start somewhere.
>Nonetheless, the clearer the name and documentation, the more reusable
>the element.
>
>
>Semantically, the API is offering a set of individual data records, each
>of which consists of:
>   * a set of labels which jointly identify an individual antenna
>   * known properties of the antenna
>   * measurements of that antenna
>   * estimated properties of that antenna
>   * record metadata
>Unfortunately, neither the names nor the documentation properly
>separate out these roles.
>
>
>Let's walk the proposal:
>
>
>
>   'mcc'    okay but -> 'mobCountryId' to match others
>   'net'    bad name -> 'mobProviderId'
>   'area'   bad name -> 'mobAreaId'
>   'cell'   bad name -> 'mobCellId'
>   'unit'   bad name -> 'mobSubUnitId'
>
>Come on! 'net', 'area', 'cell', 'unit' have generic meanings in the world
>that have nothing to do with your API. Put a little more effort into
>your naming, please! Save your users some headaches.
>
>All of these seem, from what I can tell, to be code identifiers which,
>JOINTLY, label the specific radio antenna which is the subject of the
>data record. Semantically we really have
>   'antennaId' : 'mcc'&'net'&'area'&'cell'&'unit'
>but here we use five fields instead of one. Fine, but this needs clear
>documentation. In a JSON API these would properly be grouped in a
>sub-structure, but since this is a flat API we just need clarity in the
>documentation stating that jointly these identifiers will provide a
>unique label for each record.
>
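To illustrate the point about the five identifiers acting as one label, here is a minimal sketch in JavaScript. The field names are the ones from the proposal; the function name `antennaKey`, the separator and the sample values are assumptions made up for this example.

```javascript
// Hypothetical helper: joins the five identifier columns of one export
// record into a single string key. Name and separator are illustrative.
function antennaKey(record) {
  return [record.mcc, record.net, record.area, record.cell, record.unit]
    .map(function (part) {
      // Treat missing values as empty so the key shape stays stable.
      return part === undefined || part === null ? "" : String(part);
    })
    .join("/");
}

// Made-up sample record, not taken from the real export.
const sample = { mcc: 262, net: 2, area: 1234, cell: 56789, unit: null };
console.log(antennaKey(sample)); // "262/2/1234/56789/"
```

A consumer could use such a key to detect duplicate records, which is only safe if the documentation guarantees that the five fields are jointly unique.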
>Ideally, these names would all have the form 'id...', but English places
>its adjectives first ('red car' versus 'voiture rouge' in French or
>'coche rojo' in Spanish), so we end up with a structure '...Id'. My
>proposed shared prefix 'mob' helps clarify that these fields are all
>similar and work jointly.
>
>
>
>
>   'radio'  bad name -> 'radioClass' or some such
>
>The current MozLocService item upload has a similar crappy naming 
>approach where each item has a 'radio' element but then each observed 
>cell in the item also has its own 'radio' element, of course with 
>different data. So I have to have this ridiculous lookup object:
>var CELL_TYPE_LOOKUP = {
>     'type':   ['cell.radio', 'item.radio'],//Header field
>
>     'gsm':    ['gsm',      'gsm'], //1G GSM
>     'edge':   ['gsm',      'gsm'], //2G EDGE
>     'gprs':   ['gsm',      'gsm'], //2G GPRS
>     'umts':   ['umts',     'gsm'], //3G UMTS
>     'hspa':   ['umts',     'gsm'], //3.5G HSPA
>     'hsdpa':  ['umts',     'gsm'], //3.5G HSDPA
>     'hspa+':  ['umts',     'gsm'], //3.5G HSPA+
>     'hsupa':  ['umts',     'gsm'], //3.5G HSUPA
>
>     'cdma':   ['cdma',     'cdma'], //1G CDMA
>     'is95a':  ['cdma',     'cdma'], //2G CDMA
>     'is95b':  ['cdma',     'cdma'], //2G CDMA
>     '1xrtt':  ['cdma',     'cdma'], //2G CDMA
>     'evdo0':  ['cdma',     'cdma'], //3G CDMA
>     'evdoa':  ['cdma',     'cdma'], //3G CDMA
>     'evdob':  ['cdma',     'cdma'], //3G CDMA
>     'ehrpd':  ['cdma',     'cdma'], //4G CDMA
>
>     'lte':    ['lte',      'gsm']   //4G LTE
>};
>to generate what is required. I take it this proposed API element is the
>middle column. I have taken to naming the first column 'radioType', the
>second 'radioClass', and the third 'radioFamily', but these names are
>arbitrary. First you need to decide on your name and then you need a
>bunch more documentation providing essentially this lookup table to
>explain this to users.
>
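A minimal sketch of how the lookup table quoted above could be put behind one function, so that the three names stay apart. Only a subset of the table is repeated here; `RADIO_LOOKUP` and `classifyRadio` are hypothetical names, and 'radioType', 'radioClass' and 'radioFamily' are Adrian's own arbitrary labels, not part of any API.

```javascript
// Subset of the quoted lookup table, as [cell.radio, item.radio].
const RADIO_LOOKUP = {
  gsm: ["gsm", "gsm"],
  hspa: ["umts", "gsm"],
  evdo0: ["cdma", "cdma"],
  lte: ["lte", "gsm"]
};

// Hypothetical helper: resolves a device-reported radio type to its
// 'radioClass' (per-cell value) and 'radioFamily' (per-item value).
function classifyRadio(radioType) {
  const key = String(radioType).toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(RADIO_LOOKUP, key)) {
    return null; // unknown type: let the caller decide what to do
  }
  const entry = RADIO_LOOKUP[key];
  return { radioType: key, radioClass: entry[0], radioFamily: entry[1] };
}

console.log(classifyRadio("HSPA").radioClass); // "umts"
```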
>
>
>
>
>   'lon'
>   'lat'
>
>The documentation should mention the Coordinate Reference System for 
>these as being the CRS used by the GPS system, i.e. WGS84. "The prime 
>meridian is 0 degrees" is a tautology---that's what 'prime meridian' 
>means. More properly, this could be "The Prime Meridian (with value 0 
>degrees) is the IERS Reference Meridian, close to, but not the same as,
>the Greenwich Airy Meridian."
>     https://en.wikipedia.org/wiki/World_Geodetic_System
>     http://spatialreference.org/ref/epsg/4326/
>
>
>   'changeable' terrible name
>
>As far as I can tell, this only applies to the location of the antenna,
>so the name needs to be linked to the position. From the consumer's
>standpoint, the only interesting thing is how the position has been
>'determined': either defined or estimated, and if the latter, the user
>probably wants some notion of how it was estimated. This could be done
>in a single field or in two, depending on what you want:
>
>                -> 'posEstimationMethod' DEFINED || CENTROID || ALGO_6
>or
>                -> 'posDetermination'    DEFINED || ESTIMATED
>                -> 'posEstimationAlgo'   MEASURED || CENTROID || ...
>
>
>The best way, given the variety of algorithms possible, would be to
>define a few and then use an HTTP URI (i.e. a URL) for the rest, where
>the link is to a web page with the description of the estimation
>algorithm or process. Otherwise the documentation needs some indication
>of how the position estimates were derived.
>
>Are you punting completely on giving any estimate of the accuracy of the
>position? I would expect a
>
>   'posAccuracy'
>
>giving a 95% CI radius around the observation, since that is the crucial
>factor which makes the position usable or not. (The only other element
>of your API that would let me guess as to the quality of the data would
>be the number of observations, but this does not let me know if they were
>all in a line or were well distributed spatially.) Since the service,
>which has all the data, is the only one who can properly make this
>estimate, it seems this should be generated for each record.
>
>
>
>
>   'range'  bad name -> 'rangeEstimate'
>
>Conceptually, this is an estimate of the distance at which the signal
>level drops below some particular strength, perhaps usable strength. So
>the documentation should explain that. Of course, for different radio
>technologies the threshold strength is probably different, so what is
>this really? Is this a property of the radioClass or is this an estimate
>based on the observations?
>
>
>   'samples' bad name ->  'obsNumber' or 'numObs' or 'numSamples'
>
>The name 'samples' suggests it is the samples themselves but it is 
>actually a number. The text says it is the number of observations used 
>to determine the position but we have already seen the position might 
>have been defined. So the documentation needs to be clear what other 
>entries are based on these observations: i.e. the 'range' or 
>'averageSignal'.
>
>
>   'averageSignal' -> !?
>
>Ouch. Hmm. What is this telling us about? Is this to help us estimate 
>the quality of the observations or to help us estimate the quality of 
>the position estimate? 'Max', 'Median', and 'Min' might help with the 
>former; some kind of referent of 'MaxEverForRadioClass' and 
>'MinEverForRadioClass' in the documentation would be needed for the 
>latter. A straight mathematical average for a 2D spatial estimate is
>crazy problematic to interpret a posteriori, so I am really not sure what
>this is supposed to provide users. Some clarity on the usage of this
>number and its behaviour in the field is needed in the documentation.
>
>
>
>   'created'
>   'updated'
>
>Are these purely database modification times or are these related to the
>observations? If the latter, 'firstObserved' and 'lastObserved' would be
>better names.
>
>Why make it an ambiguous timestamp, when you can make it an unambiguous
>ISO 8601 Date (e.g. 2014-07-24T12:16:36Z)?
>
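For consumers who get a numeric timestamp anyway, the conversion is short. A minimal sketch in JavaScript, assuming the 'created' and 'updated' values are Unix time in whole seconds (the documentation would need to confirm this); `toIso8601` is a made-up name.

```javascript
// Assumption: the input is Unix time in whole seconds, UTC.
function toIso8601(epochSeconds) {
  // Date expects milliseconds; toISOString() always emits UTC with a 'Z'.
  const iso = new Date(epochSeconds * 1000).toISOString();
  // Drop the fractional seconds to match the form in the example above.
  return iso.replace(/\.\d{3}Z$/, "Z");
}

console.log(toIso8601(1406204196)); // "2014-07-24T12:16:36Z"
```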
>
>
>
>
>This is not the API I would have expected.
>
>Without one or a few ways to estimate the accuracy of the position,
>these records are of little use for positioning. Without a richer
>description of the spatial structure of the observations, like bounding
>boxes or partial bounding boxes, these records are of little use in
>defining the quality of the overall database. So we are left with being
>able to get summary records which neither provide a well defined
>estimate of position and other values nor provide a rich summary of the
>data. As it stands, this API encourages direct, uncritical use of the
>positions; since OpenCellId estimates several antennae as being in the
>middle of the ocean, this is not great.
>
>Have you developed a set of usage examples for this API? Are those
>written up somewhere? What is the goal of such usages? I have a
>difficult time guessing as to the motivations which led to such an API.
>
>~adrian

To add to this:
We need a column that tells us whether the cell's signal is radial or whether
it covers only a sector.
For that sector we would need an angle and a direction (as an angle).
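
A minimal sketch of how a consumer might use such columns, in JavaScript. The names `inSector`, `direction` and `angle` are hypothetical; no such columns exist in the proposed format. It assumes 'direction' is the centre bearing of the sector in degrees clockwise from north and 'angle' is the total opening of the sector in degrees.

```javascript
// Hypothetical check: does a bearing (degrees, as seen from the antenna)
// fall inside a sector given by its centre direction and opening angle?
function inSector(bearing, direction, angle) {
  // Smallest absolute difference between the two bearings, in [0, 180].
  const diff = Math.abs(((bearing - direction) % 360 + 540) % 360 - 180);
  return diff <= angle / 2;
}

console.log(inSector(350, 10, 60)); // true: 20 degrees off centre
console.log(inSector(90, 10, 60));  // false: 80 degrees off centre
```

With this convention a radial cell needs no extra flag: an angle of 360 accepts every bearing.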

Regards,
Felix

