Re: [mySociety:public] Calculate optimum mid-point w ith TfL data — challenge from b3ta.com

Paul Robinson Sat, 28 Aug 2010 12:30:42 -0700

On 28 Aug 2010, at 14:36, Tim Morley wrote:

> If any data munging wizard on this list has nothing better to do this 
> weekend, you could answer this plea from a user of b3ta.com, posted in their 
> latest newsletter:



It's a travelling salesman problem:

http://en.wikipedia.org/wiki/Travelling_salesman_problem

Despite it being two people meeting at one mid-point, from a mathematical point 
of view it's identical to saying "Given I am in Camden and my 2nd stop must be 
Clerkenwell, what is the optimal first stop?", and therefore it can be modelled 
as pure TSP.

As such, it's therefore - unfortunately - NP-hard. That's computer science 
speak for "Are you kidding me?" :-)

In this particular case, it might be possible to work it out relatively quickly 
based on:

- Time of day and scheduled services supposedly happening
- Total journey time

If you limited it to Tube/clear service/midday start/1pm finish, it starts to 
get something you could do on paper in an afternoon, and you could brute force 
it on a computer even quicker. 

If you wanted to build a tool to take into account bus routes and stops, 
service disruption "fallback meeting places", costs based on how Oyster would 
charge it, etc. and then tried to make it more generic to allow any two 
starting points and meeting place, it starts to scale rather horribly.

If somebody comes up with a generic solution applicable for any two points and 
any meeting point, using any public transport route, and in such a way that is 
computationally cheap, all done over some spare time over a weekend, I'd say 
they are likely to be the kind of person considered for the Fields medal or a 
Nobel prize at some point in their life. :-D

That said, public transport routes and stations do compose neat directional 
graphs, and I have been looking at brute-forcing this (the only simple way, 
albeit computationally pricey), with Manchester data. When I was in London the 
other week, I couldn't help but look at the tube map and start to see it as an 
undirected graph begging to be thrown at a few algorithms.

But then I costed doing it using EC2 compute units on the back of an envelope 
and realised I needed to save up a bit more for that quantum computer. :-)

You might say "but hang on, Traveline are able to do it with their Journey 
Planner applications!", and you'd be sort of right, but that's because they're 
using a naive graph algorithm that is based on the starting time of the journey 
and the timetable. I know because I've tried lots of different journeys in that 
tool and think I've worked out what they're doing.

The ATCO CIF data (and TransXChange) formats have the concept of hubs, so you 
can delete out a crap load of nodes in the graph by aiming for hubs, which 
simplifies it somewhat - it reduces the graph in Manchester from one of 
thousands of nodes to one of dozens, essentially.

Such an algorithm will not tell you however, "if you were prepared to go 20 
minutes later, you'd save 17 minutes total journey time and £2.80". It just 
tries to find a short graph route based on timetable data from what I can see, 
and it's far from finding an optimal route IME. 

Is it obvious I've spent a fair bit of time thinking about this? :-)

--
Paul Robinson

http://vagueware.com :: [email protected] :: +44 (0) 7740 465746

Vagueware Limited is registered in England/Wales, number 05700421
13 Crossland Road, Manchester, M21 9DU

_______________________________________________
Mailing list [email protected]
Archive, settings, or unsubscribe:
https://secure.mysociety.org/admin/lists/mailman/listinfo/developers-public

Re: [mySociety:public] Calculate optimum mid-point w ith TfL data — challenge from b3ta.com

Reply via email to