Re: [Jprogramming] File Cleanup

2018-04-04 Thread Ric Sherlock
't checked it with 7!:2)
Henry Rich
On 2/21/2018 12:08 PM, Don Guinn wrote:
> Defining a verb get to retrieve the index of the desired word as tacit

Re: [Jprogramming] File Cleanup

2018-04-04 Thread Skip Cave
> performance gain.
>
> Of course, once read in words must not be modified without rebuilding get.
> But if it turns out that you don't need words for anything else than in get
> then you could erase wor

Re: [Jprogramming] File Cleanup

2018-02-21 Thread 'Mike Day' via Programming
I didn’t do a full check on my offering. I wonder, without being able to check easily right now, whether your “nbs” is a rectangular char array rather than a boxed array, like my example, “txt”. I vaguely recall that freadb returns lines as boxes, may be wrong. In any case, I did mean t

Re: [Jprogramming] File Cleanup

2018-02-21 Thread Henry Rich
s perhaps what you are looking for
R.E. Boss
-----Original Message-----
From: Programming [mailto:programming-boun...@forums.jsoftware.com] On Behalf Of Skip Cave
Sent: Wednesday, 21 February 2018 17:09
To: programm...@jsoftware.com
Subject: Re: [Jprogramming] File Cleanup
Thanks to Raul and

Re: [Jprogramming] File Cleanup

2018-02-21 Thread Skip Cave
Wow! Thanks so much for all the help on cleaning up and parsing the numberbatch text file, as well as the various methods for extracting words and their associated vectors from the data. It will take me a bit to digest all this, as well as some time to test the various suggestions, to see which sch

Re: [Jprogramming] File Cleanup

2018-02-21 Thread Raul Miller
> n get
> then you could erase words after get is defined so storage used by a big
> verb is offset by not having words around any more.
>
> On Wed, Feb 21, 2018 at 9:31 AM, R.E. Boss wrote:
>> vec {~ (<'a

Re: [Jprogramming] File Cleanup

2018-02-21 Thread Don Guinn
> On Wed, Feb 21, 2018 at 9:31 AM, R.E. Boss wrote:
>> vec {~ (<'adults') i.~ words
>> is perhaps what you are looking for
>>
>> R.E. Boss
>>
>> -----Original Message-----
>> From:

Re: [Jprogramming] File Cleanup

2018-02-21 Thread Henry Rich
kip Cave
Sent: Wednesday, 21 February 2018 17:09
To: programm...@jsoftware.com
Subject: Re: [Jprogramming] File Cleanup
Thanks to Raul and Mike for the suggestions.
I read in the data:
   nb =: <'C:\numberbatch-en.txt'
   nbs =. fread nb
Then I tried to clean it up:
Mike's method r

Re: [Jprogramming] File Cleanup

2018-02-21 Thread Raul Miller
> verb is offset by not having words around any more.
>
> On Wed, Feb 21, 2018 at 9:31 AM, R.E. Boss wrote:
>> vec {~ (<'adults') i.~ words
>> is perhaps what you are looking for

Re: [Jprogramming] File Cleanup

2018-02-21 Thread Don Guinn
> R.E. Boss
>
> -----Original Message-----
> From: Programming [mailto:programming-boun...@forums.jsoftware.com] On Behalf Of Skip Cave
> Sent: woens

Re: [Jprogramming] File Cleanup

2018-02-21 Thread Raul Miller
> -----Original Message-----
> From: Programming [mailto:programming-boun...@forums.jsoftware.com] On Behalf Of Skip Cave
> Sent: Wednesday, 21 February 2018 17:09
> To: programm...@jsoftware.com
> Subject: Re: [Jprogramming

Re: [Jprogramming] File Cleanup

2018-02-21 Thread Don Guinn
> n Behalf Of Skip Cave
> Sent: Wednesday, 21 February 2018 17:09
> To: programm...@jsoftware.com
> Subject: Re: [Jprogramming] File Cleanup
>
> Thanks to Raul and Mike for the suggestions.
> I read in the data:
>    nb =

Re: [Jprogramming] File Cleanup

2018-02-21 Thread Raul Miller
Yes, you need to box the name when comparing it to the value of 'words', because the names in 'words' are all boxed. I'd do it like this:
   get=:3 :'vec{~words i.
wrote:
> Thanks to Raul and Mike for the suggestions.
> I read in the data:
>    nb =: <'C:\numberbatch-en.txt'
>    nbs =. fread n
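To illustrate the point about boxing the name, here is a minimal sketch. The definition of get in the archived message is cut off, so the version below is an assumed completion and not necessarily the one Raul posted; words and vec are small made-up stand-ins for the parsed data.

```j
NB. Hypothetical stand-ins for the parsed data (not the real numberbatch contents)
words =: 'bell' ; 'belt'                        NB. list of boxed labels
vec   =: 2 2 $ 0.0264 _0.2927 0.1332 0.0142     NB. one row of numbers per label

NB. Assumed completion: box the argument, find its index in words, select that row
get =: 3 : 'vec {~ words i. < y'

get 'belt'                                      NB. 0.1332 0.0142
```

Without the box, 'belt' would be searched for as four separate characters, none of which matches a boxed label. A word that is missing from words gives an index of #words, so the selection from vec fails with an index error.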

Re: [Jprogramming] File Cleanup

2018-02-21 Thread R.E. Boss
> @jsoftware.com
> Subject: Re: [Jprogramming] File Cleanup
>
> Thanks to Raul and Mike for the suggestions.
> I read in the data:
>    nb =: <'C:\numberbatch-en.txt'
>    nbs =. fread nb
> Then I tried to clean it up:
> Mike

Re: [Jprogramming] File Cleanup

2018-02-21 Thread Don Guinn
You need to convert words to a list. Also, you might use &> instead of each, as it needs to be unboxed to use as an index.
On Feb 21, 2018 9:09 AM, "Skip Cave" wrote:
> Thanks to Raul and Mike for the suggestions.
> I read in the data:
>    nb =: <'C:\numberbatch-en.txt'
>    nbs =. fread nb

Re: [Jprogramming] File Cleanup

2018-02-21 Thread Skip Cave
Thanks to Raul and Mike for the suggestions.
I read in the data:
   nb =: <'C:\numberbatch-en.txt'
   nbs =. fread nb
Then I tried to clean it up:
Mike's method ran out of memory:
   nbs4 =. ( i.&' ' ({.;0 ". }.)] ) every nbs
|out of memory
When I tried to run it on a smaller set:
   nbs4=: (i.&' '

Re: [Jprogramming] File Cleanup

2018-02-21 Thread Ric Sherlock
Or using the tables/dsv addon:
   load 'tables/dsv'
   dat=: makenum ' ' readdsv 'yourfile.txt'
Note that although they're boxed the numbers are actually numeric. To split them you could do:
   labels=: {."1 dat
   numbers=: > }."1 dat
On Wed, Feb 21, 2018 at 11:03 PM, Ric Sherlock wrote:
> Another sugge

Re: [Jprogramming] File Cleanup

2018-02-21 Thread Ric Sherlock
Another suggestion using some of J's in-built utilities:
   dat=: freads 'yourfile.txt'
   labels=: <@(' '&taketo);._2 dat
   numbers=: _ ". (' '&takeafter);._2 dat
HTH
Ric
On Wed, Feb 21, 2018 at 9:57 PM, 'Mike Day' via Programming <programm...@jsoftware.com> wrote:
> txt here is a set of lines from y

Re: [Jprogramming] File Cleanup

2018-02-21 Thread 'Mike Day' via Programming
txt here is a set of lines from your example with trailing ... removed; here are the first two:
   ,.2{.txt
+--+
|bell 0.0264 -0.2927 -0.0254 -0.1034 0.1672 -0.0440 -0.0019 0.1210 |
+--

Re: [Jprogramming] File Cleanup

2018-02-21 Thread Raul Miller
I think you want this pair of expressions:
   <@({.~ i.&' ');._2 text
   0 1 }. _&".;._2 text
(Note that giving ". a left argument tells J that you only want numeric results. When you do this, you don't need to do the search and replace, because J recognizes - as being a minus sign. Also, since
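As a rough sketch of how that pair of expressions behaves, here they are on a made-up two-line sample. The contents of text and the names words and vec are assumptions for illustration, not data from the real file; LF is the linefeed noun from the standard library. With a left argument, ". substitutes that argument for anything it cannot read as a number, so the label becomes _ in the first column and is then dropped.

```j
NB. Hypothetical sample in the same layout: a label, then numbers, LF-terminated lines
text =: 'bell 0.0264 -0.2927', LF, 'belt 0.1332 0.0142', LF

NB. ;._2 cuts on the final character (LF here) and applies the verb to each line
words =: <@({.~ i.&' ');._2 text     NB. boxed labels: everything before the first space
vec   =: 0 1 }. _&".;._2 text        NB. parse each line, then drop the label column

$ words                              NB. 2
$ vec                                NB. 2 2
```

Both results have one item per line of text, so an index found in words can be used directly to select a row of vec.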

Re: [Jprogramming] File Cleanup

2018-02-21 Thread 'Jon Hough' via Programming
   a=:'belt 0.1332 0.0142 -0.1208 -0.0574 0.1451 -0.0731 -0.1293 0.0855'
   (}.~ #@:>@:{.@:;: ) a
On Wed, 2/21/18, Skip Cave wrote:
Subject: [Jprogramming] File Cleanup
To: "programm...@jsoftware.com"
Date: Wednesday, February 21, 2018, 5:36 PM
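A short sketch of what the expression above appears to do, on a shortened made-up line (the value of a here is an assumption for illustration). It uses ;: to split the line into words, measures the first one, and drops that many characters from the front, leaving the numeric text ready for ". to convert.

```j
NB. Hypothetical shortened line in the same layout
a =: 'belt 0.1332 0.0142 -0.1208'

NB. Drop as many characters as the first word has; the leading space remains
(}.~ #@:>@:{.@:;:) a

NB. Convert what is left to numbers; - is read as a minus sign
_ ". (}.~ #@:>@:{.@:;:) a            NB. 0.1332 0.0142 _0.1208
```

Because ;: applies J's own word-formation rules, this relies on the label forming a single J word. A label containing punctuation may be split differently, in which case searching for the first space is the safer choice.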