Re: [Jprogramming] performance : J or C ?

Robert Cyr Tue, 07 Oct 2008 20:20:14 -0700

Of course my intent was not to challenge grep or the numerous searching
utilities at our disposal.  As far as grep is concerned, even in this
trivial case I find J easier to use and gave up on grep after a few tests.


The intent of the note to the forum was simply to point out that, if properly
used, J can be quite fast, and moving to C is usually unnecessary and rarely
worth the trouble.  In the example provided, C has a small speed advantage
(0.35 versus 0.67 seconds), but my tests suggest that with a larger number of
selected IP addresses, J actually gains on C.  The Linux sample code is well
written, but did not benefit from the extensive testing and profiling often
required in a commercial application.  In this case, profiling might have
indicated that using realloc() is a bit time-consuming.
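
To make the realloc() remark concrete, here is a minimal sketch in C.  It
assumes the cost comes from growing the result array one entry per match,
which is only a guess: the actual 150-line program is not reproduced in this
message.  The names ip_list, ip_list_push, ip_list_free and IP_LEN are
hypothetical and chosen for illustration only.  The idea is to double the
capacity whenever the array fills up, so realloc() runs a logarithmic number
of times instead of once per address.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed width: "255.255.255.255" plus the terminating NUL. */
#define IP_LEN 16

/* Hypothetical growable list of addresses (not from the original C code). */
typedef struct {
    char (*items)[IP_LEN]; /* contiguous array of fixed-width strings */
    size_t count;          /* entries in use */
    size_t capacity;       /* entries allocated */
} ip_list;

/* Append one address.  When the array is full the capacity is doubled, so
   realloc() is called O(log n) times rather than once per address.
   Returns 0 on success, -1 if allocation fails (list left unchanged). */
static int ip_list_push(ip_list *list, const char *ip)
{
    if (list->count == list->capacity) {
        size_t new_cap = list->capacity ? list->capacity * 2 : 64;
        void *p = realloc(list->items, new_cap * sizeof *list->items);
        if (p == NULL)
            return -1;
        list->items = p;
        list->capacity = new_cap;
    }
    /* Copy at most IP_LEN - 1 characters and force NUL termination. */
    strncpy(list->items[list->count], ip, IP_LEN - 1);
    list->items[list->count][IP_LEN - 1] = '\0';
    list->count++;
    return 0;
}

/* Release the storage and reset the list to its empty state. */
static void ip_list_free(ip_list *list)
{
    free(list->items);
    list->items = NULL;
    list->count = 0;
    list->capacity = 0;
}
```

A caller would start from a zero-initialized ip_list, call ip_list_push()
for each match, and call ip_list_free() at the end.  Whether this helps in
practice depends on the allocator and on how many addresses are selected, so
it would need to be confirmed by profiling.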

In fact, I used the wiki format suggested by others as the only way to
submit C code, considering that attached documents are not permitted on this
forum.





On Tue, Oct 7, 2008 at 12:52 AM, Oleg Kobchenko <[EMAIL PROTECTED]> wrote:

> Is this C code portable? Will it run on Windows and Mac?
> You need to compile C sources for each supported platform,
> etc, etc.
>
> Also interesting, what is the time comparison of these
> C or J solutions with grep? Since it's a trivial task.
>
>
> > From: Robert Cyr <[EMAIL PROTECTED]>
> >
> > Roger Hui wrote:
> > >I would be interested to see the C routine.  J Forum msgs
> > >do not seem to permit attachments, so you may have to
> > >post the code in a J wiki page.
> >
> > I may not have write permission in J wiki pages.  It may be of interest to
> > some that:
> >
> >    - J code extracts its IPs in 5 lines. I read up a bit on mapped files and
> >    it took about 1/2 hour to write the code.  It executes in 0.67 seconds
> >    after loading J.
> >    - C code was written as an exercise in 6-8 hours including a revision.
> >    It was cleanly written in 150 lines and executes in 0.35 seconds.
> >
> > The author of this C code had studied languages such as ML before, but my
> > demonstration did not persuade him to give J a try.
> >
> >
> > On Fri, Oct 3, 2008 at 6:41 PM, Dan Bron wrote:
> >
> > > Robert wrote:
> > > > The difference is not dramatic
> > >
> > > > Interesting, it was in my tests.  However, the speedup is more a function
> > > of the number of IPs (i.e. matches found by  'rhost=
> > > > '&E.  ) than the total size of the file, and obviously my test data
> > > > differs from yours.
> > >
> > > > I need to ignore all characters from the first blank
> > > > on each IP address, in case the IP is shorter than 14
> > > > characters and is followed by something else.
> > >
> > > > Ah, of course.  Didn't think of that.  Given this constraint, I believe
> > > > the majority of the time is going to be spent in trimming
> > > > and reassembling this array.  (Again, provided your list of IPs is large.)
> > >
> > > One potential optimization would avoid trimming the large list of IPs.
> > >  Take the nub of the "dirty list", then trim, then nub
> > > again.  Trimming and reassembling this (presumably) smaller array might
> > > save you more than the cost of the second nub.
> > >
> > > > Another potential optimization (which again improves with the size of
> > > > the IP list) is to avoid manifesting the entire addition
> > > > table.  Instead of
> > >
> > >        ip        =:  ] {~ I.@:E. +/ '255.255.255.255' (+i.)&#~ [
> > >
> > > you could try
> > >
> > >        ip        =:  ] ];.0~ (#'255.255.255.255') ,:"0/~ #@:[ + I.@:E.
> > >
> > > > I haven't tested any of these proposed changes, so they may be broken,
> > > > buggy, or even slower than the current solution.  And
> > > > improvements, if any, are likely to be even less dramatic than the first
> > > > round.
> > > >
> > > > Also, I believe your solution currently completes in less than a second,
> > > > which is not such a burden.  So it might not be worth
> > > > rewriting, particularly since optimized code tends to be less familiar
> > > > (hence, less readable/maintainable).
> > > >
> > > > The real value of J in this instance is that it's (A) [more than]
> > > > competitive with the equivalent C code and (B) it was much
> > > > faster to write (correct?) and very likely easier to change.  The more time
> > > > spent optimizing, the smaller the benefit of (B).
> > >
> > > >  to be as efficient as possible in J, then tacit is the way to go
> > >
> > > > Since one of my two optimizations was a non-starter, the entire speedup
> > > > can be attributed to the replacement of  I. x E. y  with
> > > > x  I.@:E.  y  .  That construct is supported by special code, and happens
> > > > to be tacit.  So in this case the efficiencies were a
> > > > result of using special code, rather than tacit code, per se.
> > >
> > > -Dan
> > >
> > > ----------------------------------------------------------------------
> > > For information about J forums see http://www.jsoftware.com/forums.htm
