RE: [Jprogramming] Efficiently converting fixed-width to delimited

R.E. Boss Mon, 05 Feb 2007 11:07:19 -0800

If I understand your code correctly,
  l marks, for each field (up to the next delimiter or LF), the characters
that follow the leading blanks, whereas
  t marks, for each field, the characters that precede the trailing blanks.
So with (l*.t)#y you lose the blanks at both ends of every field.
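
As a rough illustration for readers less fluent in J, the two masks described above can be sketched in Python. This is only a sketch under assumptions, not the original code: the function name trim_fields is hypothetical, the names s, d, l and t merely mirror the J names, and the text is assumed to end with a delimiter or LF.

```python
def trim_fields(y, delim="\t"):
    """Drop the blanks at both ends of every field of y (hypothetical helper)."""
    s = [c != " " for c in y]                   # True where y is not a blank
    d = [c == delim or c == "\n" for c in y]    # True at each delimiter or LF

    # l: scanning left to right, True once a non-blank has been seen in the
    # current interval; an interval ends at (and includes) a delimiter.
    l, seen = [], False
    for non_blank, is_delim in zip(s, d):
        seen = seen or non_blank
        l.append(seen)
        if is_delim:
            seen = False

    # t: scanning right to left, True where a non-blank lies at or to the
    # right of the position within the current interval; an interval starts
    # at (and includes) a delimiter.
    t, seen = [False] * len(y), False
    for i in range(len(y) - 1, -1, -1):
        seen = seen or s[i]
        t[i] = seen
        if d[i]:
            seen = False

    # Keep only the characters marked in both masks.
    return "".join(c for c, in_l, in_t in zip(y, l, t) if in_l and in_t)


line = " + 4.31:A  +-   :H +  -:H+4.30:A+4.32:A+4.25:A+4.25\n"
print(repr(trim_fields(line, "+")))
```

Blanks inside a field survive because both masks are set there; only the blanks that touch a delimiter, an LF or the start of the text are removed.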


I thought about a sequential machine (;:) doing the same, and created the
following state table, character map m and state array S:

NB. +---+-+----+-+-+-+-+------------------------+
NB. |' '| |x;LF| |Z| | |state                   |
NB. +---+-+----+-+-+-+-+------------------------+
NB. |1  |1|3   |1|4|1|0|initial                 |
NB. +---+-+----+-+-+-+-+------------------------+
NB. |1  |0|3   |1|4|1|1|begin blanks            |
NB. +---+-+----+-+-+-+-+------------------------+
NB. |2  |0|3   |1|4|2|2|other consecutive blanks|
NB. +---+-+----+-+-+-+-+------------------------+
NB. |1  |2|3   |2|4|2|3|field separator x, or LF|
NB. +---+-+----+-+-+-+-+------------------------+
NB. |2  |2|3   |2|4|2|4|no space, not x nor LF  |
NB. +---+-+----+-+-+-+-+------------------------+

m=:<:+/2 1*(<a.)-.@e.&>' ';'+',LF   NB. ' ' to 0, x;LF to 1, rest to 2
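
The comment on m gives the intended mapping over the 256-character alphabet: blank to class 0, the delimiter or LF to class 1, everything else to class 2. A small Python sketch of one arithmetic that yields this mapping (weighted "not in this set" flags, summed and decremented) follows. The name build_class_map is hypothetical and the code is an illustration, not a transcription of the J line.

```python
def build_class_map(delim="+"):
    """Return a 256-entry list: 0 for blank, 1 for delim or LF, 2 otherwise."""
    blanks = {" "}
    seps = {delim, "\n"}
    table = []
    for code in range(256):
        c = chr(code)
        # Weight 2 for "not a blank", weight 1 for "not a separator",
        # then subtract 1 so the classes come out as 0, 1 and 2.
        table.append(2 * (c not in blanks) + 1 * (c not in seps) - 1)
    return table


m = build_class_map("+")
print(m[ord(" ")], m[ord("+")], m[ord("\n")], m[ord("A")])
```

Indexing the table by character code then gives the column of the state table to use for that character.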

S=: _2]\"1 }.".;._2 (0 : 0) 
''' ''   x;LF    Z ']0
1       1       3       1       4       1
1       0       3       1       4       1
2       0       3       1       4       2
1       2       3       2       4       2
2       2       3       2       4       2
)

   '+' fw2dl ' + 4.31:A  +-   :H +  -:H+4.30:A+4.32:A+4.25:A+4.25',LF
+4.31:A+-   :H+-:H+4.30:A+4.32:A+4.25:A+4.25

   (1;S;m);: ' + 4.31:A  +-   :H +  -:H+4.30:A+4.32:A+4.25:A+4.25',LF
+4.31:A+-   :H+-:H+4.30:A+4.32:A+4.25:A+4.25
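
To make the state table easier to follow, here is a minimal Python simulation of it. It rests on assumptions about ;: that are not spelled out in this message: output code 0 does nothing, code 1 moves the word start j to the current position, code 2 emits the characters from j up to the current position and then moves j, and a pending word is emitted at the end of the input. The names char_class and run_machine are hypothetical, and the emitted words are simply joined into one string.

```python
def char_class(c, delim="+"):
    """0 = blank, 1 = delimiter or LF, 2 = anything else."""
    if c == " ":
        return 0
    if c == delim or c == "\n":
        return 1
    return 2


# Rows are states 0..4, columns are character classes; each entry is
# (next state, output code), copied from the state table in the post.
S = [
    [(1, 1), (3, 1), (4, 1)],  # 0 initial
    [(1, 0), (3, 1), (4, 1)],  # 1 begin blanks
    [(2, 0), (3, 1), (4, 2)],  # 2 other consecutive blanks
    [(1, 2), (3, 2), (4, 2)],  # 3 field separator x, or LF
    [(2, 2), (3, 2), (4, 2)],  # 4 no space, not x nor LF
]


def run_machine(y, delim="+"):
    state, j, words = 0, -1, []
    for i, c in enumerate(y):
        next_state, code = S[state][char_class(c, delim)]
        if code == 2:          # emit y[j:i], then start a new word at i
            words.append(y[j:i])
            j = i
        elif code == 1:        # restart the word at i, discarding y[j:i]
            j = i
        state = next_state
    if j != -1:                # assumed flush of the pending word at the end
        words.append(y[j:])
    return "".join(words)


line = " + 4.31:A  +-   :H +  -:H+4.30:A+4.32:A+4.25:A+4.25\n"
print(repr(run_machine(line)))
```

Leading and trailing blanks disappear because the machine restarts the word (code 1) instead of emitting it when a blank run is followed by a delimiter or began right after one; a blank run between two non-blanks is emitted as a word (code 2) and so is kept.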

For  data=: LF,~"1 [5000 1000$' +  4.31:  A +-:H   +-
:H+4.30:A+4.32:A+4.25:A+4.25 '
the performance figures are:
   
   ts' ''+'' fw2dl"1 data '
0.28420038 21950912
   ts' (1;S;m);: "1 data '
0.11558694 27071040
   ts' ''+'' fw2dl ,data '
0.2271174 58722560
   ts' (1;S;m);:  ,data '
0.090205502 16778688

So the sequential machine is faster by a factor of about 2.5 in both cases.

    ((1;S;m)&;: -: '+'&fw2dl) ,data 
1
    ((1;S;m)&;:"1 -: '+'&fw2dl"1) data 
1


R.E. Boss


-----Original Message-----
From: Dan Bron [mailto:[EMAIL PROTECTED] 
Sent: Monday, 5 February 2007 16:08
To: [EMAIL PROTECTED]
CC: Programming Forum
Subject: RE: [Jprogramming] Efficiently converting fixed-width to delimited

R.E. Boss wrote:

>As I see it, you amend d by 1(0)}d but you don't use d anymore (in fw2dl),
>is that correct?

Nope, it's not correct. You found a bug! 

The line after  d =. 1 (0)} d  should have used  d  as a left hand argument.
So the complete function should have been:

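        fw2dl =: verb define
          TAB fw2dl y                      NB. monad: default delimiter is TAB
        :
          s =. y ~: ' '                    NB. 1 where y is not a blank
          d =. y e. LF,x                   NB. 1 at each delimiter or LF

          l =. d ([: ; <@(+./\ );.2) s     NB. 0 on the leading blanks of each field

          d =. 1 (0)} d                    NB. let the first character start an interval
NB.       t =. s ([: ; <@(+./\.);.1) s     NB.  <-- Buggy version
          t =. d ([: ; <@(+./\.);.1) s     NB. 0 on the trailing blanks of each field

          e =. l*.t
          y =. e#y                         NB. keep characters marked in both masks
        )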

Thanks for noticing this and pointing it out!  This was in production code!
I'm actually surprised the script didn't break. 

I guess next time I'll choose variable names separated by more than one key.
Brevity has its drawbacks, too.

-Dan


----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
