Larry -

I said that I'd follow up with an array implementation and I'm told that
I'm a "bottom line" kind of guy, so here it is - all 5 lines of it. 
I'll put some additional info below it -

//  Get the average price
Avgprice       = ( O + H + L + C ) / 4;
//  Get the bars since the first tick
bsince         = BarsSince( Day() != Nz( Ref( Day(), -1 ) ) );
//  Get the sum of the volume for the day
Daytotvol       = Sum( V, bsince + 1 );
//  Get the vwap for each tick of the day
Dayvwap         = Sum( Avgprice * V, bsince + 1 ) / Daytotvol;
//  Get the running variance and standard deviation for each tick of the day
//  NOTE - the use of Daytotvol outside of the Sum() function is a subtle,
//  but important point
Daystdev         = sqrt( Sum( ( ( Avgprice - Dayvwap ) ^ 2 ) * V, bsince + 1 )
                         / Daytotvol );
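
If you want to see these on the chart, the arrays drop straight into
Plot().  This is only a sketch, assuming the titles, colors and the 1 and
2 standard deviation bands from your original code, and that it is
appended below the five statements above -

```afl
//  Sketch only - assumes Dayvwap and Daystdev from the statements above
Plot( Dayvwap,                "VWAP",       colorDarkGrey, styleLine );
//  Bands at 1 and 2 running standard deviations around the VWAP
Plot( Dayvwap + Daystdev,     "VWAP_std+1", colorGrey50,   styleDashed );
Plot( Dayvwap - Daystdev,     "VWAP_std-1", colorGrey50,   styleDashed );
Plot( Dayvwap + 2 * Daystdev, "VWAP_std+2", colorGrey40,   styleDashed );
Plot( Dayvwap - 2 * Daystdev, "VWAP_std-2", colorGrey40,   styleDashed );
```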

I isolated just the code that did the work so I could offer a few
comments.  I know that array logic can look daunting at first.  I've
taught it to a fair number of people.  It will help if you can visualize
each bar as a column in a spreadsheet and each variable as a row.  The
full explanation is beyond the scope of this post, but if you keep that
analogy in mind, it should help.  So, what is going on above?  The first
3 statements are straightforward.

avgprice - just a version of average price
bsince - this is an array of bars since the first bar of a day
daytotvol - this is an array of the running total of the daily volume
initialized at the start of each day
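
One way to see the spreadsheet picture for these three statements is to
dump them in an Exploration.  A minimal sketch, assuming you run it over a
short range of 1 minute bars; the column titles and number formats are
just my choices.  Note that the Exploration shows it transposed from my
analogy, with each bar as a row and each variable as a column -

```afl
//  Sketch only - run as an Exploration on an intraday symbol
Avgprice  = ( O + H + L + C ) / 4;
bsince    = BarsSince( Day() != Nz( Ref( Day(), -1 ) ) );
Daytotvol = Sum( V, bsince + 1 );

Filter = 1;                                //  list every bar in the range
AddColumn( V,         "Volume",    1.0 );
AddColumn( Avgprice,  "Avgprice",  1.2 );
AddColumn( bsince,    "bsince",    1.0 );  //  0 on the first bar of each day
AddColumn( Daytotvol, "Daytotvol", 1.0 );  //  restarts on the first bar of each day
```

Reading down the bsince column, you should see it count 0, 1, 2, ... and
drop back to 0 at each new day, with Daytotvol restarting at that bar's
volume on the same row.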

Statement 4 utilizes the variable period lookback available to the Sum()
function.  It gets the running sum of avgprice x volume for each day
and divides that result by the running total volume.  This gives the
volume-weighted average price, or VWAP.

Statement 5 is the "heavy lifting".  If you unpack it, it uses the same
V/daytotvol weighting technique to derive a "running" variance.  You may
need to look at this a little while.  FWIW, it is not unusual for a
single array statement to take quite a while to analyze.
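
If a formula helps with the unpacking, here is a sketch of what
statements 4 and 5 compute at bar i.  The symbols are mine, not AFL
names: s is the first bar of the day, P_k is Avgprice at bar k, V_k is
the volume at bar k, and T_i is Daytotvol at bar i -

```latex
T_i = \sum_{k=s}^{i} V_k ,
\qquad
\mathrm{VWAP}_i = \frac{1}{T_i} \sum_{k=s}^{i} P_k V_k ,
\qquad
\sigma_i = \sqrt{ \frac{1}{T_i} \sum_{k=s}^{i} V_k \left( P_k - \mathrm{VWAP}_k \right)^2 }

% The same quantity written as a bar-by-bar update, starting each day
% from Var_{s-1} = 0 and T_{s-1} = 0:
\mathrm{Var}_i = \frac{T_{i-1}}{T_i} \, \mathrm{Var}_{i-1}
               + \frac{V_i}{T_i} \left( P_i - \mathrm{VWAP}_i \right)^2 ,
\qquad
\sigma_i = \sqrt{ \mathrm{Var}_i }
```

Two things to notice.  Each squared deviation is taken against the VWAP
as it stood at bar k, because Dayvwap is itself an array inside the
Sum().  And the division by Daytotvol happens once, outside the Sum(), so
every term is weighted by the running total at bar i rather than by the
total at its own bar.  Multiplying the update through by T_i and
unrolling it gives the sum form, which is why it matches the looping
version I posted earlier.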

In conclusion, I thought it might be interesting to look at performance
(on a slow machine).  Your original code had a SetBarsRequired(), but
remember that AFL must first make a pass through the code to get this,
and if you are using 100,000+ bars, that first pass is painful.  Let's
just look at the performance of the original after AFL has analyzed it
and is using SetBarsRequired(), then the corrected version that I posted
earlier, and also the one above -

Original version for 5000 bars - 1.7 seconds
Corrected original version for 5000 bars - 32 milliseconds
Array implementation for 5000 bars - 13 milliseconds
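
If you want to check numbers like these on your own machine, one option
is AFL's GetPerformanceCounter().  This is just a sketch of that
approach, not necessarily how the figures above were taken; the trace
message text is only an illustration -

```afl
//  Sketch only - GetPerformanceCounter() returns elapsed milliseconds
GetPerformanceCounter( True );              //  reset the counter to zero

//  ... the code being timed goes here ...

elapsed = GetPerformanceCounter();          //  ms since the reset
_TRACE( "Elapsed: " + NumToStr( elapsed, 1.2 ) + " ms" );
```

The _TRACE() text goes to the debug output, so you will need something
like DebugView (or the Log window, if your AmiBroker version has one) to
read it.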

Now, the weighted variance makes this more complex than normal array
apps.  But, just to show off the power of what Tomasz has done with
fine-tuning array functions, here is the array version at 50000 bars -

Array implementation for 50000 bars - 150 milliseconds

That is why I leave you with this thought.  Un-nested for loops are fine
for prototyping, but for production performance -

JUST SAY NO TO LOOPING :-)

BruceR




--- In [email protected], "bruce1r" <bru...@...> wrote:
>
> Larry -
>
> Good first attempt.  I thought I'd tackle this for two reasons.  First,
> you did a fair amount of work before asking.  But, second, I had been
> about to write an article for a web site about array processing and
> happened upon your post.  If you don't mind, I'd like to use a variation
> of it as an example for that.  There are several great teaching points
> in the performance issue that you ran into.
>
> There are two classes of issues in your AFL - an algorithm/math issue
> and a looping issue.  Let's deal with the major one first.
>
> As you might suspect, the inner loop is the problem.  If you assume
> approx. 400 one-minute bars per day, then on average for each bar, you
> will run through about 200 iterations of the inner (j) loop.  This has
> to go!
>
> Here's the original code for reference -
>
> // now the hard part...calculate the variance...
> // a separate calc from the start of each day - note it requires the
> // vwap from above
> // also note, we calculate starting at the first bar in the new day
> // up to the current bar
>      Variance = 0;
>
>      for ( j = newdayindex; j < i; j++ )
>      {
>          AvgPrice = ( O[j] + H[j] + L[j] + C[j] ) / 4;
>          Variance += ( Volume[j] / totVolume ) *
>                      ( Avgprice - Vwap2temp ) * ( Avgprice - Vwap2temp );
>      }
>
> The way to get rid of it may not be what you expect, though.  You are
> calculating a variance at each bar by restarting the calculation from
> the first bar of the day.  At first glance, it probably appeared that
> this was required because of the volume weighting.
>
> With some fairly straightforward algebraic manipulation, the formula can
> be converted into a "running" variance calculation.  I'll just show the
> result, and you can work through it.  Replace the code above with -
>
>      variance = ( prevvar * prevtotvol / totvolume ) +
>                 ( Volume[i] / totvolume ) *
>                 ( Avgprice - Vwap2temp ) * ( Avgprice - Vwap2temp );
>
>      prevtotvol = totvolume;
>      prevvar = Variance;
>
> Finally, to support this code, there are two areas of initialization.  I
> did it this way to minimize changes to your original code.  Insert the
> following code in two places -
>
> prevtotvol = 0;
> prevvar = 0;
>
> Put it once above the outer (i) loop before the for( i=0; i<Barcount;
> i++).  Then, put it in the newday initialization found in the if (
> newday[i] == True) block.
>
> This should yield an improvement of over two orders of magnitude.
> Actually, a little more could be wrung out as you have a few
> calculations that could be streamlined.  I didn't post the entire code,
> because the endgame is really an AFL with NO LOOPS that uses purely
> array processing.  That will yield at least an additional 2x
> improvement at the smaller intervals (5000 bars) that you are probably
> using, but much greater as bars increase.  As importantly, it is only 7
> or 8 lines of code in total.  I'll show you that next and later.
>
> -- BruceR
>
>
> --- In [email protected], "shakerlr" ljr500@ wrote:
> >
> > I just created the following code to calculate the VWAP + std
> > deviation bands, but have found that it is extremely slow.  I posted
> > the original code to the amibroker study site and was wondering if
> > anyone has any suggestions to speed it up for display on 1 minute
> > charts.
> >
> > Also, I noticed that if I DO NOT USE:
> > SetBarsRequired( 1000, 0 );
> >
> > The bands show up incorrect...(sometimes expanding/shrinking as I
> > scroll on the 1 minute chart)
> >
> > Note that I have about 100000 bars in my stock/ticker being
> > studied...so that may be the reason it is slow...
> >
> > ----
> > /// VWAP code that also plots standard deviations...if you want a
> > // 3rd...it should be fairly simple to add
> > //
> > // NOTE: the code is SLOOOOWWWW...can someone help speed it up?
> > // I tried my best, but can't really do much with the two for-loops...
> > //
> > // LarryJR
> >
> >
> > SetBarsRequired( 1000, 0 );
> >
> > // this stores true/false based on a new day...
> > newday=Day() != Ref(Day(), -1);
> >
> > SumPriceVolume=0;
> > totVolume=0;
> > Vwap2=0;
> > stddev=0;
> > newdayindex=0;
> > Variance =0;
> >
> > // we must use a loop here because we need to save the vwap for each
> > // bar to calc the variance later
> > for( i= 0; i < BarCount; i++ )
> > {
> >  // only want to reset our values at the start of a new day
> >  if (newday[i]==True)
> >  {
> >   SumPriceVolume=0;
> >   totVolume=0;
> >   newdayindex=i; // this is the index at the start of a new day
> >   Variance=0;
> >   //Vwap2=0;
> >  }
> >  AvgPrice=(O[i] + H[i] + L[i] + C[i])/4;
> >
> >  // Sum of Volume*price for each bar
> >  sumPriceVolume += AvgPrice * (Volume[i]);
> >
> >  // running total of volume each bar
> >  totVolume += (Volume[i]);
> >
> >  if (totVolume[i] >0)
> >  {
> >   Vwap2[i]=Sumpricevolume / totVolume ;
> >   Vwap2temp=Vwap2[i];
> >  }
> >
> >  // now the hard part...calculate the variance...
> >  // a separate calc from the start of each day - note it requires the
> >  // vwap from above
> >  // also note, we calculate starting at the first bar in the new day
> >  // up to the current bar
> >  Variance=0;
> >  for (j=newdayindex; j < i; j++)
> >  {
> >   AvgPrice=(O[j] + H[j] + L[j] + C[j])/4;
> >   Variance += (Volume[j]/totVolume) *
> > (Avgprice-Vwap2temp)*(Avgprice-Vwap2temp);
> >  }
> >  stddev_1_pos[i]=Vwap2temp + sqrt(Variance);
> >  stddev_1_neg[i]=Vwap2temp - sqrt(Variance);
> >
> >  stddev_2_pos[i]=Vwap2temp + 2*sqrt(Variance);
> >  stddev_2_neg[i]=Vwap2temp - 2*sqrt(Variance);
> > }
> > Plot (Vwap2,"VWAP2",colorDarkGrey, styleLine);
> > Plot (stddev_1_pos,"VWAP_std+1",colorGrey50, styleDashed);
> > Plot (stddev_1_neg,"VWAP_std-1",colorGrey50, styleDashed);
> > Plot (stddev_2_pos,"VWAP_std+2",colorGrey40, styleDashed);
> > Plot (stddev_2_neg,"VWAP_std-2",colorGrey40, styleDashed);
> >
>
