Re: [Jprogramming] File Cleanup

Ric Sherlock Wed, 21 Feb 2018 02:23:20 -0800

Or using the tables/dsv addon:

load 'tables/dsv'
dat=: makenum ' ' readdsv 'yourfile.txt'


Note that although they're boxed, the numbers are actually numeric.
To split them you could do:
labels=: {."1 dat
numbers=: > }."1 dat
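
For anyone who wants to try the split without a file on disk or the addon, here is a minimal sketch of the standard-library approach quoted further down. The inline sample text (two lines, four numbers each) is an assumption standing in for the real file contents, which freads would return with a trailing LF; taketo and takeafter come from J's standard library.

```j
NB. Hypothetical sample standing in for the result of: freads 'yourfile.txt'
txt=: 'bell 0.0264 -0.2927 -0.0254 -0.1034',LF,'bell_tower -0.1252 -0.1233 0.1351 0.1897',LF

NB. Cut on the LF that ends each line; box the text before the first space.
labels=: <@(' '&taketo);._2 txt

NB. Take the text after the first space on each line and convert it.
NB. Dyadic ". accepts '-' as a minus sign; _ is the fill for non-numeric fields.
numbers=: _ ". (' '&takeafter);._2 txt

$ numbers    NB. one row per line, one column per number
```

With the real data the same expressions should give one boxed label per line and a numeric table with 300 columns, provided every line has the same count of numbers; shorter lines would be padded with the fill value _.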


On Wed, Feb 21, 2018 at 11:03 PM, Ric Sherlock <[email protected]> wrote:

> Another suggestion using some of J's in-built utilities
>
> dat=: freads 'yourfile.txt'
>
> labels=: <@(' '&taketo);._2 dat
> numbers=: _ ". (' '&takeafter);._2 dat
>
> HTH
> Ric
>
> On Wed, Feb 21, 2018 at 9:57 PM, 'Mike Day' via Programming <
> [email protected]> wrote:
>
>> txt here is a set of lines from your example with trailing ... removed;
>> here are the first two:
>>     ,.2{.txt
>> +----------------------------------------------------------------------+
>> |bell 0.0264 -0.2927 -0.0254 -0.1034 0.1672 -0.0440 -0.0019 0.1210     |
>> +----------------------------------------------------------------------+
>> |bell_tower -0.1252 -0.1233 0.1351 0.1897 0.0242 0.0014 0.1942 -0.0237 |
>> +----------------------------------------------------------------------+
>>
>> This separates words from numeric vectors of arbitrary length:
>>    ( i.&' ' ({.;0 ". }.)] ) every txt
>> +----------+-----------------------------------------------------------+
>> |bell      |0.0264 _0.2927 _0.0254 _0.1034 0.1672 _0.044 _0.0019 0.121 |
>> +----------+-----------------------------------------------------------+
>> |bell_tower|_0.1252 _0.1233 0.1351 0.1897 0.0242 0.0014 0.1942 _0.0237 |
>> +----------+-----------------------------------------------------------+
>> |belt      |0.1332 0.0142 _0.1208 _0.0574 0.1451 _0.0731 _0.1293 0.0855|
>> +----------+-----------------------------------------------------------+
>> |belfast   |0.119 _0.044 _0.0254 _0.209 0.2144 0.0348 _0.1467 0.1256   |
>> +----------+-----------------------------------------------------------+
>>
>> It should be easy enough to split off the first column as a word-list,
>> and the second as a vector of vectors.
>>
>> OK?
>>
>> Mike
>>
>> On 21/02/2018 08:36, Skip Cave wrote:
>>
>>> I read in a text file of word vectors using fread. The format looks like
>>> this:
>>>
>>> bell 0.0264 -0.2927 -0.0254 -0.1034 0.1672 -0.0440 -0.0019 0.1210 ...
>>>
>>> bell_tower -0.1252 -0.1233 0.1351 0.1897 0.0242 0.0014 0.1942 -0.0237 ...
>>>
>>> belt 0.1332 0.0142 -0.1208 -0.0574 0.1451 -0.0731 -0.1293 0.0855 ...
>>>
>>> belfast 0.1190 -0.0440 -0.0254 -0.2090 0.2144 0.0348 -0.1467 0.1256 ...
>>>
>>> Everything is literal text.
>>>
>>> The basic layout for each line is:
>>>
>>> word(s) (could contain multiple words separated by underscores)
>>> space
>>> number (positive or negative) in text format
>>> space
>>> number (positive or negative) in text format
>>> space
>>> ......   repeat for 300 numbers (in text)
>>>
>>> the last number is followed by a line feed for the next line
>>>
>>> I need to:
>>> 1. Convert all the high minuses (-) to J's low minus (_)
>>> 2. Extract the word(s) up to the first space into a separate array
>>> (words)
>>> 3. Convert the text numbers into a 2D array of ? x 300 floating point
>>> numbers
>>>
>>> I know how to do #1 (string replace), and #3 (".) once I get rid of the
>>> words,
>>> but I don't know how to strip out the initial word on each line and put
>>> them in a separate array. Any help is appreciated.
>>>
>>> Skip
>>> ----------------------------------------------------------------------
>>> For information about J forums see http://www.jsoftware.com/forums.htm
>>>
>>
>>
>
>
