[Haskell-cafe] Re: Longest increasing subsequence

ChrisK Fri, 11 Apr 2008 12:34:02 -0700

My late night suggestions were nearly correct. I have actually written the codenow. Once keeping track of indices, and a second time without them:

{-# LANGUAGE BangPatterns #-}
-- By Chris Kuklewicz, copyright 2008, BSD3 license
-- Longest increasing subsequence
-- (see http://en.wikipedia.org/wiki/Longest_increasing_subsequence)
import Data.List (foldl')
import Data.Map (Map)
import qualified Data.Map as M (empty,null,insert,findMin,findMax
                               ,splitLookup,deleteMin,delete)


type DList a = [a] -> [a]

lnds :: Ord a => [a] -> [a]
lnds = lnds_decode . lnds_fold

lnds_fold :: Ord a => [a] -> Map a (DList a)
lnds_fold = foldl' process M.empty where
  -- The Map keys, in sorted order, are the input values which
  --   terminate the longest increasing chains of length 1,2,3,…
  process mu x =
    case M.splitLookup x mu of
      (_,Just {},_) -> mu -- ignore x when it is already an end of a chain

      (map1,Nothing,map2) | M.null map2 ->
        -- insert new maximum element x
        if M.null mu
          then M.insert x (x:) mu -- x is very first element
          else let !xs = snd (M.findMax mu)
               in M.insert x (xs . (x:)) mu

                          | M.null map1 ->
        -- replace minimum element with smaller x
        M.insert x (x:) (M.deleteMin mu)

                          | otherwise ->
        -- replace previous element oldX with slightly smaller x
        let !xs = snd (M.findMax map1)
            !oldX = fst (M.findMin map2) -- slightly bigger key
            !withoutOldX = M.delete oldX mu
        in M.insert x (xs . (x:)) withoutOldX

lnds_decode :: Ord a => Map a (DList a) -> [a]
lnds_decode mu | M.null mu = []
               | otherwise = snd (M.findMax mu) []

tests =  [ ['b'..'m'] == (lnds $ ['m'..'s'] ++ ['b'..'g'] ++ ['a'..'c'] ++ 
['h'..'k'] ++ ['h'..'m'] ++ ['d','c'..'a'])
         , "" == lnds ""
         , "a" == lnds "a"
         , "a" == lnds "ba"
         , "ab" == lnds "ab"
         ]


Comparing to wikipedia:

The X[M[1]],X[M[2]],… sequence is strictly increasing. These are the ends ofthe current increasing chains of length 1,2,… and they are the keys to the Mapin my code.

The values of the map are the subsequences themselves, in DList form.Instead of pointing to the index of the previous element I just lookup '!xs' andappend '(x:)' to that.


Complexity:

The strictness annotations ensure that the garbage collector can destroy anyunreachable DList entries. The space usage is thus O(N) and may be O(1) forcertain inputs (such as the best case of never-increasing input list). Astrictly increasing input list is the worst case for space usage.

The naive time complexity of 'process' for the i'th input value is O(log i).This can be double checked by looking at the time complexity of everything Iimport from Data.Map.


Peak performance could be had by

(1) adding the first element before the foldl' to avoid checking for thiscase in process(2a) accessing the internal map structure to optimize thesplitLookup->delete->insert case into a single operation(2b) Using something like a zipper to access the to-be-deleted-and-replacedelement of the mapThe (2a) and (2b) work because we know the changed key will go into the same'slot' of the map as the old one.


--
Chris

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

[Haskell-cafe] Re: Longest increasing subsequence

Reply via email to