Chouffe commented on a change in pull request #15340: [Clojure] Add fastText example
URL: https://github.com/apache/incubator-mxnet/pull/15340#discussion_r297295285
 
 

 ##########
 File path: contrib/clojure-package/examples/cnn-text-classification/src/cnn_text_classification/data_helper.clj
 ##########
 @@ -90,15 +92,21 @@
   ([path embedding-size]
    (load-word2vec-model path embedding-size {:max-vectors 100})))
 
-(defn read-text-embedding-pairs [rdr]
-  (for [^String line (line-seq rdr)
+(defn read-text-embedding-pairs [pairs]
+  (for [^String line pairs
         :let [fields (.split line " ")]]
     [(aget fields 0)
      (mapv #(Float/parseFloat ^String %) (rest fields))]))
 
 (defn load-glove [glove-file-path]
   (println "Loading the glove pre-trained word embeddings from " glove-file-path)
-  (into {} (read-text-embedding-pairs (io/reader glove-file-path))))
+  (into {} (read-text-embedding-pairs (line-seq (io/reader glove-file-path)))))
+
+(def remove-fasttext-metadata rest)
+
+(defn load-fasttext [fasttext-file-path]
+  (println "Loading the fastText pre-trained word embeddings from " fasttext-file-path)
+  (into {} (read-text-embedding-pairs (remove-fasttext-metadata (line-seq (io/reader fasttext-file-path))))))
 
 Review comment:
   Can we also use a threading macro here for readability?
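
   For illustration, here is a rough sketch of how `load-fasttext` might read with the thread-last macro `->>`. The helper definitions are copied from the diff above so the snippet stands alone. The namespace name `threading-sketch` is a placeholder and not part of the PR. This is one possible shape, not necessarily the form the author should adopt.

```clojure
(ns threading-sketch ; hypothetical namespace, for illustration only
  (:require [clojure.java.io :as io]))

;; Copied from the diff: turns "word f1 f2 ..." lines into [word [f1 f2 ...]] pairs.
(defn read-text-embedding-pairs [pairs]
  (for [^String line pairs
        :let [fields (.split line " ")]]
    [(aget fields 0)
     (mapv #(Float/parseFloat ^String %) (rest fields))]))

;; Copied from the diff: drops the first line of the file before parsing.
(def remove-fasttext-metadata rest)

(defn load-fasttext [fasttext-file-path]
  (println "Loading the fastText pre-trained word embeddings from " fasttext-file-path)
  ;; Each step receives the previous result as its last argument,
  ;; so the pipeline reads top to bottom instead of inside out.
  (->> (io/reader fasttext-file-path)
       line-seq
       remove-fasttext-metadata
       read-text-embedding-pairs
       (into {})))
```

   The behavior should match the nested version in the diff. Only the order in which the steps are written changes.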
