Thanks a bunch, Patrick! That cleared things up for me considerably.

Best,
Matthias

On 07/05/15 12:40, Patrick Niklaus wrote:
>> - Did you implement all of the described HMM break conditions (route
>> localization, low probability routes, GPS outliers)? After reading the
>> code in OSRM, I was only able to find the "low probability routes"
>> condition. Did I overlook something?
>
> The localization is implemented by choosing the candidates before we
> start the algorithm. For each input point we adaptively choose between
> 5 and 10 candidates based on the distance to the previous input point.
> That part of the algorithm can be found in "plugins/match.hpp". The
> outliers test is not implemented; I'm not sure it would add much value
> over the limited search radius for candidates combined with the
> pruning based on transition probability.
>
>>
>> - As far as I understand, MAX_DISTANCE_DELTA corresponds to the delta
>> when comparing the route length and great circle distance for the "low
>> probability routes" condition. The paper states a delta of 2000m, the
>> implementation uses a delta of 200m. Feature or bug?
>>
>
> I found that 2000m is a little bit on the conservative side. At least
> for my data 200m worked pretty well (sampling period was approximately
> 7s).
> Please note that most parameters are tuned for sampling periods of
> around 5 to 10 seconds.
>
>> - What exactly does the "confidence" return value mean?
>>
>
> Since we are dealing with real world data, matching will fail for some
> traces. That might be because the trace is too noisy or the data from
> OpenStreetMap has problems like connectivity errors. To get a handle
> on that I gathered some empirical data on mismatched traces and tried
> to find a good feature to classify matchings as valid or invalid. The
> feature that worked best for me was the ratio between trace length and
> matching length (the intuition here is that invalid matchings tend to
> contain "loops" where detours are taken).
> I used that labeled data to
> fit a Laplacian distribution and constructed a naive Bayes classifier
> based on that.
> The "confidence" is the probability P(x \in valid). The values are
> only based on ~800 labeled traces with a specific sampling rate, so
> take that value with a grain of salt for your data.
>
> What is missing is a good parameter selection based on the sample rate
> of the input. It's not clear when I will have time again to do that
> (for now massaging the data to fit the current constraints works quite
> well).
>
> _______________________________________________
> OSRM-talk mailing list
> [email protected]
> https://lists.openstreetmap.org/listinfo/osrm-talk
>

--
Matthias Schwamborn

University of Osnabrück          Tel.: +49-541-969-7167
Institute of Computer Science    Fax: +49-541-969-2799
Albrechtstr. 28                  E-mail: [email protected]
D-49076 Osnabrück, Germany       http://cs.uos.de/schwamborn/
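
For readers following the MAX_DISTANCE_DELTA discussion quoted above, here is a minimal Python sketch of the "low probability routes" check as described in the thread: a transition is rejected when the route length and the great-circle distance between two consecutive input points differ by more than the delta. This is not the OSRM code. The function names, the haversine formula, and the Earth radius are assumptions made for illustration; only the 200m value and the comparison itself come from the thread.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumption of this sketch
MAX_DISTANCE_DELTA = 200.0    # metres, the value mentioned in the thread


def great_circle_distance(lat1, lon1, lat2, lon2):
    """Haversine distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against rounding errors near antipodal points.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def is_low_probability_route(route_length_m, prev_point, cur_point,
                             max_delta=MAX_DISTANCE_DELTA):
    """Return True if the route length deviates from the great-circle
    distance between the two (lat, lon) points by more than max_delta."""
    gc = great_circle_distance(*prev_point, *cur_point)
    return abs(route_length_m - gc) > max_delta
```

Passing `max_delta=2000.0` reproduces the more conservative threshold from the paper; as noted in the thread, 200m was tuned on data sampled roughly every 7 seconds.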
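
The "confidence" value described in the quoted reply can be illustrated with a small Python sketch: fit a Laplace distribution to the trace-length/matching-length ratio and combine the densities with Bayes' rule to get P(x \in valid). This is an assumption-laden reading of the thread, not the OSRM implementation. In particular, fitting one Laplace distribution per class, estimating the prior from class frequencies, and all function names are choices made here for illustration. The labeled data itself is not part of the thread, so none is included.

```python
import math
from statistics import median


def fit_laplace(samples):
    """Maximum-likelihood Laplace fit: location is the median,
    scale is the mean absolute deviation from the median."""
    mu = median(samples)
    b = sum(abs(x - mu) for x in samples) / len(samples)
    return mu, b


def laplace_pdf(x, mu, b):
    """Density of the Laplace distribution with location mu and scale b > 0."""
    return math.exp(-abs(x - mu) / b) / (2.0 * b)


def fit_classifier(valid_ratios, invalid_ratios):
    """Fit one Laplace distribution per class (an assumption of this sketch)
    and take the prior from the class frequencies."""
    n_valid, n_invalid = len(valid_ratios), len(invalid_ratios)
    return {
        "valid": fit_laplace(valid_ratios),
        "invalid": fit_laplace(invalid_ratios),
        "prior_valid": n_valid / (n_valid + n_invalid),
    }


def confidence(ratio, model):
    """Posterior P(valid | ratio), where ratio = trace length / matching length."""
    p_valid = laplace_pdf(ratio, *model["valid"]) * model["prior_valid"]
    p_invalid = (laplace_pdf(ratio, *model["invalid"])
                 * (1.0 - model["prior_valid"]))
    return p_valid / (p_valid + p_invalid)
```

With a single feature, the naive Bayes classifier reduces to this one application of Bayes' rule. The fit needs samples that are not all identical in each class, otherwise the scale is zero; and as the reply stresses, the resulting value is only as good as the labeled traces and sampling rate it was fitted on.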
