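NB. one sample line from the file
a=:'belt 0.1332 0.0142 -0.1208 -0.0574 0.1451 -0.0731 -0.1293 0.0855'
NB. drop as many characters as the first word (found with ;:) contains
(}.~ #@:>@:{.@:;: ) a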
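The expression above handles one line. As a rough sketch only, the same split might be applied to every line of the file as follows. Here txt is a hypothetical two-line sample standing in for the fread result, and lines, word, rest, digs and nums are names made up for illustration. It assumes LF, each and rplc are available from the standard J profile and that lines end in LF only. It also swaps in a different technique: it cuts at the first space with i. instead of using ;: so that the word does not have to be a valid J token.

```j
NB. Hypothetical sample standing in for the text returned by fread;
NB. each line ends with LF, as described in the question.
txt   =: 'bell 0.0264 -0.2927 -0.0254', LF, 'bell_tower -0.1252 -0.1233 0.1351', LF

lines =: <;._2 txt                  NB. cut on the trailing LF, one box per line
word  =: {.~ i.&' '                 NB. hook: keep the text before the first space
rest  =: }.~ i.&' '                 NB. hook: drop the text before the first space

words =: word each lines            NB. step 2: boxed list of words
digs  =: rest each lines            NB. boxed numeric text, still with - signs
digs  =: rplc&('-';'_') each digs   NB. step 1: high minus to low minus
nums  =: > ". each digs             NB. step 3: one row of numbers per line
```

With the real file each row of nums should have 300 columns, provided every line carries the same count of numbers; if some lines are short, > pads those rows with zeros.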
--------------------------------------------
On Wed, 2/21/18, Skip Cave <[email protected]> wrote:
Subject: [Jprogramming] File Cleanup
To: "[email protected]" <[email protected]>
Date: Wednesday, February 21, 2018, 5:36 PM
I read in a text file of word vectors using fread. The format looks like this:
bell 0.0264 -0.2927 -0.0254 -0.1034 0.1672 -0.0440 -0.0019 0.1210 ...
bell_tower -0.1252 -0.1233 0.1351 0.1897 0.0242 0.0014 0.1942 -0.0237 ...
belt 0.1332 0.0142 -0.1208 -0.0574 0.1451 -0.0731 -0.1293 0.0855 ...
belfast 0.1190 -0.0440 -0.0254 -0.2090 0.2144 0.0348 -0.1467 0.1256 ...
Everything is literal text.
The basic layout for each line is:
word(s) (could contain multiple words separated by underscores)
space
number (positive or negative) in text format
space
number (positive or negative) in text format
space
...... repeat for 300 numbers (in text)
the last number is followed by a line feed for the next line
I need to:
1. Convert all the high minuses (-) to J's low minus (_)
2. Extract the word(s) up to the first space into a separate array (words)
3. Convert the text numbers into a 2D array of ? x 300 floating point numbers
I know how to do #1 (string replace), and #3 (".) once I get rid of the words, but I don't know how to strip out the initial word on each line and put them in a separate array. Any help is appreciated.
Skip
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm