This message is more informational than anything.  I had been
struggling with performance issues using updates via just using row
operations.  This post discusses my experience solving this issue.

I'm using a Perl wrapper I have written around the base the protocol.
For those using the client libraries this may not be an issue.

I'm syncing a data source of frequently updated records with a Google
spreadsheet so that I can use the visualization library to create a
dashboard.   The data source is actually help desk tickets.  The
dashboard provides users and engineers a view into the helpdesk system
without the need to log into it.  More important this provides
protected access beyond the boundaries of our corporate intranet
without the need to VPN, as well as providing an interface on devices
like the iPhone where no client for the helpdesk system exists.

The tickets are updated frequently.  Similarly new tickets need to be
added and old tickets removed as they are closed.  Performance of the
updates is crucial if the dashboard is to be considered relevant.
The worksheet of open tickets contains about 4000 records on average,
with about 25 fields.  Performance using the row update protocol was
fine with smaller data sets but did not cut it for this heavier
demand.  The script I was use for the updates is invoked every hour -
but the update itself would sometimes take several.

It's worth noting that the updates needed to be serial.  Running
multiple instances in parallel triggered Google to identify my machine
as a bot, locking it out with a 403 error for an indeterminate amount
of time.  I was advised by Google not to hit any one worksheet with
multiple requests in parallel.

After implementing functions to support the batch cell operations, I
found that my overall performance improved dramatically for rows being
updated.  But performance was still a problem for using the row
operations to delete rows and then to add new ones.

What is essentially a no-brainer was the use of the cell operations to
simply overwrite the rows being deleted with those to be added.
Obvious, but I'm chagrined to say it did not initially occur to me.

When adding multiple rows is still necessary, rather than using the
row operations I instead update the rowCount in the worksheet meta-
data, fetch the empty rows (by appending '?return-empty=true' to the
URL), then using the batch cell operations to fill in those empty
cells.

Deleting rows is still somewhat of an issue but at the moment is
manageable.  If necessary I'll explore either clearing them with the
batch operations and then configuring the dashboard queries to ignore
these records; or perhaps look into marking those that are deleted,
sorting the worksheet, then similarly updating the rowCount to effect
their removal.

I realize that this is a somewhat non-technical post.  But I banged my
head against this for sometime.  So if it helps anyone, great!





--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Google Docs Data APIs" group.
To post to this group, send email to Google-Docs-Data-APIs@googlegroups.com
To unsubscribe from this group, send email to 
google-docs-data-apis+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/Google-Docs-Data-APIs?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to