Hi,

I'm trying to figure out how to load a huge file that contains some 800k
pairs of integers (two integers per line), which represent the edges of a
directed graph.

So if the ith line has x and y, it means that there is a directed edge
from vertex x to vertex y in the graph.

The goal is to load it into an array-of-arrays representation, where the
kth array contains all the nodes that the kth node has a directed edge to.
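
To make the target concrete, here is a minimal sketch of that representation
on a tiny input. The sample edges and the helper name edges->adjacency are
made up for illustration, and it assumes the vertices are numbered from 0 to
n-1; it is not meant as a solution for the large file.

```clojure
;; Illustration only: the sample edges and the helper name are hypothetical.
;; Assumes vertices are numbered 0..n-1.
(defn edges->adjacency
  "Builds a vector of vectors: element k holds the targets of edges leaving k."
  [n edges]
  (reduce (fn [adj [from to]]
            ;; append `to` to the successor vector stored at index `from`
            (update-in adj [from] conj to))
          (vec (repeat n []))
          edges))

(edges->adjacency 4 [[0 1] [0 2] [2 3] [3 0]])
;; => [[1 2] [] [3] [0]]
```

Vertex 1 has no outgoing edges here, so its entry is an empty vector.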

I've attempted multiple variants of with-open, reader, line-seq etc., but
almost always ended up with an OutOfMemoryError or something VERY slow.

My latest attempt, which also does not work on the large input:

(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn load-graph [input-f]
  (with-open [rdr (io/reader input-f)]
    (->> (line-seq rdr)
         ;; parse each "x y" line into a pair of ints
         (map (fn [row]
                (let [[v1str v2str] (str/split row #"\s")]
                  [(Integer/parseInt v1str) (Integer/parseInt v2str)])))
         ;; build a map from each vertex to the vector of its successors
         (reduce (fn [G [v1 v2]]
                   (update-in G [v1] (fnil conj []) v2))
                 {}))))

I'm getting a bit frustrated, as there are Python and Go implementations
that load the graph in less than 5 seconds.

What am I doing wrong?

Thanks

-- 
László Török
