Programming challenge: Re: [cs-lisp] Re: PHP+MySQL versus Lisp: regarding the Shortest Path problem

Chris Stephenson Sat, 17 Dec 2005 07:44:46 -0800

1.

OK, we misunderstood the input format: it is neighbour lists, not pairs of contacts. Emre's example test data misled us.

2.

My challenge to my students still stands (modified in the light of our new information).

a) Modify the Graham code to work on pairs of contacts

b) Find the -algorithmic- inefficiency in Graham's code and correct it. My example (misunderstood) test data gives a clue to the algorithmic inefficiency.

c) Solve the implementation inefficiencies in Graham's code to get down to a solution that is O(e+v) or, at the very worst, O((e+v) log (e+v)).
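
For challenge (a), one way to bridge the two input formats is to convert pairs of contacts into neighbour lists before searching. The following is only a minimal sketch, written in Python for brevity rather than in Scheme; the function name neighbour_lists_from_pairs and the sample data are invented for illustration, and contacts are assumed to be symmetric.

```python
from collections import defaultdict

def neighbour_lists_from_pairs(pairs):
    """Turn (a, b) contact pairs into a dict mapping each member to a list of neighbours.

    Assumes contacts are symmetric: if a knows b, then b knows a.
    """
    net = defaultdict(list)
    for a, b in pairs:
        net[a].append(b)
        net[b].append(a)
    return dict(net)

# Hypothetical sample data, not taken from the thread.
pairs = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
net = neighbour_lists_from_pairs(pairs)
```

The conversion visits each pair once, so it costs O(e) and does not change the overall bound asked for in (c).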


I once again attach the scheme version of the Graham code.

3.
Technicalities:-
There are two inefficiencies in Graham's code

(a) the algorithmic inefficiency that is the subject of challenge 2(b).

(b) the use of inherently time-inefficient LISP structures. This is important, but -much- less important than (a). LISP/Scheme, of course, offers alternative, more efficient structures, or you can easily and cheaply build your own.
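
A sketch of how both kinds of inefficiency might be avoided, in Python for brevity. This is a guess at the intended direction, not the author's published solution. It rests on two observations about the attached code: it never records which nodes have already been reached, so nodes on cycles are expanded again and again, and it appends to a plain list used as a queue, which copies the queue at every step. The sketch records one predecessor per node (which also serves as the visited set) and uses a constant-time queue. The net layout, a dict of neighbour lists, is an assumption.

```python
from collections import deque

def shortest_path(start, end, net):
    """Breadth-first search over neighbour lists; returns a list of nodes, or None."""
    parent = {start: None}      # predecessor links; doubles as the "visited" set
    queue = deque([start])      # O(1) append and popleft
    while queue:
        node = queue.popleft()
        if node == end:
            # Walk the predecessor links back to start.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for n in net.get(node, []):
            if n not in parent:     # each node is enqueued at most once
                parent[n] = node
                queue.append(n)
    return None

# Hypothetical network with a cycle (a-b-c-a) and a disconnected pair (x-y).
net = {"a": ["b", "c"], "b": ["a", "c", "d"], "c": ["a", "b"],
       "d": ["b"], "x": ["y"], "y": ["x"]}
```

Each node enters the queue at most once and each neighbour list is scanned at most once, which gives the O(e+v) bound. Unlike a search without a visited set, it also terminates when no path exists, for example from "a" to "x" in the sample network.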


4.
Cember's problem:

a) I repeat that this is a problem whose solution is LINEAR in the total size of its input. It is therefore easily soluble, in Cember's practical situation, without blowing up the server, even for tens of thousands of subscribers. You can make that hundreds of thousands of subscribers, given the likely e/v ratio (quite low) and the likely pattern of contacts. O(e+v) is for finding ALL shortest paths starting at a given node. We only want one.

b) IF the Cember SQL solution has the same -algorithmic- inefficiency as Graham's solution, that will explain the explosions.

c) Those on the mail lists who are proposing caching are missing the point. If we keep the neighbour lists in primary memory, this problem is trivial. If there is a performance problem, do -not- cache the shortest paths (it is a list of size O(n^3)!). Cache the neighbour lists!! We know when we update the neighbour lists in the database, so we can keep them in a simple write-through cache.

d) Language choice is not important. Whatever language the existing system is written in, use that. PHP gives extensible arrays, efficiently implemented as hash tables, and easy interface to the database. Given that you then only have to write seven lines of code, even I would be prepared to (in fact prefer to!) write it in PHP: then wash my hands, of course. Writing this particular algorithm efficiently in LISP/Scheme requires doing some unlispy things or writing some complicated code. Why bother with the interfacing problems for no significant gain? Actually you are exaggerating the interfacing problems.
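
The write-through cache of neighbour lists suggested in (c) could look roughly like the following Python sketch. The class name NeighbourCache and the load_all / store_contact callables standing in for database access are hypothetical; nothing here reflects Cember's actual schema or code.

```python
class NeighbourCache:
    """Keeps every neighbour list in memory; writes go to the database and the cache together."""

    def __init__(self, load_all, store_contact):
        # load_all() yields (member, neighbour) rows; store_contact(a, b) persists one row.
        # Both are hypothetical stand-ins for the real database calls.
        self._store_contact = store_contact
        self._net = {}
        for a, b in load_all():
            self._net.setdefault(a, set()).add(b)

    def neighbours(self, member):
        """Return the cached neighbour set, empty for an unknown member."""
        return self._net.get(member, set())

    def add_contact(self, a, b):
        # Write through: update the database first, then the in-memory lists.
        self._store_contact(a, b)
        self._store_contact(b, a)
        self._net.setdefault(a, set()).add(b)
        self._net.setdefault(b, set()).add(a)

# Usage, with an in-memory list pretending to be the database table.
rows = [("a", "b"), ("b", "a")]
cache = NeighbourCache(lambda: list(rows), lambda a, b: rows.append((a, b)))
cache.add_contact("b", "c")
```

Because every update passes through add_contact, the in-memory lists never go stale, so no validity check of the kind needed for cached paths is required.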

Emre, put Cember in touch with me. I will solve their problem. It is trivial.


After 0900 on Wednesday I will publish my solutions.

Sorry, Can Burak, shortest path is an -easy- problem!

CS




Can Burak Cilingir wrote:

[ ... ]
What you say is: once your server has crunched and cached
the results of those queries, OK, it won't crash if the
same queries are made. But of course, each time brand new
queries with different Contacts are generated.

Could I make myself clear this time?
I was already clear on that, but I am trying to implement caching somehow. My question in the previous mail was not "can caching handle the load or not". That was the point of my first mail (do that once a day), and I understood that it is not possible.
So my question is: is there a better way to cache this? (See my paragraph below, "Maybe we need a trade-off here".)
 >function shortest-path (membera memberb)
 >{
 >if ispathcached (membera memberb)

Probably not.
You are absolutely right. When I browse such sites, I rarely click on people whom I already know.
 >     p = getcachedpath (membera memberb)
 >     t = getcachedtime (membera memberb)

 >//is cache still valid?
 >for each member of p as m
 >     mt = getmodificationtimeofconnlist(m)
 >     if (mt > t)
 >     {
 >         np = regeneratecache(membera memberb)
 >         return np
 >     }

And you imagine connections are rock hard?
Maybe our good old acquaintance has just left the
network. I'm making the same query, you and me
but the network data has changed. So you have
to modify your cache. That means recalculating.
What an acquaintance! Anyway, again you had to
calculate. Lots of calculations, people are
clicking, think of 10.000 people network, a few
thousand online, every minute a few 10 people
are coming, partially connected and making queries
which are not cached yet.

Maybe we need a trade-off here.
We can cache some intermediate paths and somehow calculate "the not-so-shortest path". If a -> c is (a g j o d c), then for a -> d you can assume (a g j o d), although there may be a path such as (a w d). If you need a connection, we have that. But if you need "the" shortest path, this is not the answer. I think this is better than showing nothing.
We are also talking about "shortest path", which is not an easy problem to solve, and about doing this over and over again. I am curious how orkut handles that load (with the help of Google's cluster? :)).
Emre S.

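;; Like assoc, but returns null instead of #f when q is not a key in alist.
(define assoc-null
  (lambda (q alist)
    (let ((answer (assoc q alist)))
      (cond
        ((not answer) null)
        (else answer)))))

;; Like cdr, but returns null when given null.
(define cdr-null
  (lambda (x)
    (cond
      ((null? x) null)
      (else (cdr x)))))

;; Shortest path from start to end in net, where net is a list of
;; neighbour lists of the form (node neighbour ...).
(define shortest-path
  (lambda (start end net)
    (bfs end (list (list start)) net)))

;; Breadth-first search. queue is a list of partial paths, each stored
;; in reverse order (most recently reached node first).
(define bfs
  (lambda (end queue net)
    (if (null? queue)
        null
        (let ((path (car queue)))
          (let ((node (car path)))
            (if (eq? node end)
                (reverse path)
                (bfs end
                     (append (cdr queue)
                             (new-paths path node net))
                     net)))))))

;; Extend path by one step to each neighbour of node.
(define new-paths
  (lambda (path node net)
    (map (lambda (n)
           (cons n path))
         (cdr-null (assoc-null node net)))))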

_______________________________________________
cs-lisp mailing list
cs-lisp@cs.bilgi.edu.tr
http://church.cs.bilgi.edu.tr/lcg
http://cs.bilgi.edu.tr/mailman/listinfo/cs-lisp
