Or using the tables/dsv addon:

   load 'tables/dsv'
   dat=: makenum ' ' readdsv 'yourfile.txt'

Note that although they're boxed, the numbers are actually numeric. To split them you could do:

   labels=: {."1 dat
   numbers=: > }."1 dat

On Wed, Feb 21, 2018 at 11:03 PM, Ric Sherlock <tikk...@gmail.com> wrote:

> Another suggestion using some of J's in-built utilities
>
>    dat=: freads 'yourfile.txt'
>
>    labels=: <@(' '&taketo);._2 dat
>    numbers=: _ ". (' '&takeafter);._2 dat
>
> HTH
> Ric
>
> On Wed, Feb 21, 2018 at 9:57 PM, 'Mike Day' via Programming <
> programm...@jsoftware.com> wrote:
>
>> txt here is a set of lines from your example with trailing ... removed;
>> here are the first two:
>>    ,.2{.txt
>> +----------------------------------------------------------------------+
>> |bell 0.0264 -0.2927 -0.0254 -0.1034 0.1672 -0.0440 -0.0019 0.1210     |
>> +----------------------------------------------------------------------+
>> |bell_tower -0.1252 -0.1233 0.1351 0.1897 0.0242 0.0014 0.1942 -0.0237 |
>> +----------------------------------------------------------------------+
>>
>> This separates words from numeric vectors of arbitrary length:
>>    ( i.&' ' ({.;0 ". }.)] ) every txt
>> +----------+-----------------------------------------------------------+
>> |bell      |0.0264 _0.2927 _0.0254 _0.1034 0.1672 _0.044 _0.0019 0.121 |
>> +----------+-----------------------------------------------------------+
>> |bell_tower|_0.1252 _0.1233 0.1351 0.1897 0.0242 0.0014 0.1942 _0.0237 |
>> +----------+-----------------------------------------------------------+
>> |belt      |0.1332 0.0142 _0.1208 _0.0574 0.1451 _0.0731 _0.1293 0.0855|
>> +----------+-----------------------------------------------------------+
>> |belfast   |0.119 _0.044 _0.0254 _0.209 0.2144 0.0348 _0.1467 0.1256   |
>> +----------+-----------------------------------------------------------+
>>
>> It should be easy enough to split off the first column as a word-list,
>> and the second as a vector of vectors.
>>
>> OK?
>>
>> Mike
>>
>> On 21/02/2018 08:36, Skip Cave wrote:
>>
>>> I read in a text file of word vectors using fread. The format looks like
>>> this:
>>>
>>> bell 0.0264 -0.2927 -0.0254 -0.1034 0.1672 -0.0440 -0.0019 0.1210 ...
>>>
>>> bell_tower -0.1252 -0.1233 0.1351 0.1897 0.0242 0.0014 0.1942 -0.0237 ...
>>>
>>> belt 0.1332 0.0142 -0.1208 -0.0574 0.1451 -0.0731 -0.1293 0.0855 ...
>>>
>>> belfast 0.1190 -0.0440 -0.0254 -0.2090 0.2144 0.0348 -0.1467 0.1256 ...
>>>
>>> Everything is literal text.
>>>
>>> The basic layout for each line is:
>>>
>>> word(s) (could contain multiple words separated by underscores)
>>> space
>>> number (positive or negative) in text format
>>> space
>>> number (positive or negative) in text format
>>> space
>>> ...... repeat for 300 numbers (in text)
>>>
>>> the last number is followed by a line feed for the next line
>>>
>>> I need to:
>>> 1. Convert all the high minuses (-) to J's low minus (_)
>>> 2. Extract the word(s) up to the first space into a separate array
>>> (words)
>>> 3. Convert the text numbers into a 2D array of ? x 300 floating point
>>> numbers
>>>
>>> I know how to do #1 (string replace), and #3 (".) once I get rid of the
>>> words,
>>> but I don't know how to strip out the initial word on each line and put
>>> them in a separate array. Any help is appreciated.
>>>
>>> Skip
>>> ----------------------------------------------------------------------
>>> For information about J forums see http://www.jsoftware.com/forums.htm
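
For anyone who wants to try the taketo/takeafter approach without a file on disk, here is a minimal sketch. It assumes a standard J session with the usual standard library loaded, and it uses a made-up two-line noun (sample) in place of the result of freads; the sample text and the name sample are illustrative only. As Mike's output in the thread shows, dyadic ". reads a leading - as a negative sign, so the string replace from step 1 should not be needed here.

```j
NB. Sketch only: a short sample stands in for the text returned by freads.
NB. 'sample' and its contents are hypothetical, not the real data file.
sample=: 'bell 0.0264 -0.2927 -0.0254', LF, 'bell_tower -0.1252 -0.1233 0.1351', LF

NB. ;._2 cuts on the final character (here LF) and applies the verb to each line.
labels=: <@(' '&taketo);._2 sample          NB. boxed text before the first space
numbers=: _ ". (' '&takeafter);._2 sample   NB. numeric table; _ marks unparsable fields

$ numbers   NB. rows = number of lines, columns = numbers per line
```

With the real file, replacing sample by the result of freads 'yourfile.txt' gives the same two nouns as in Ric's earlier message.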