Re: [abcusers] thinking in the large

John Chambers Tue, 27 Apr 2004 10:48:29 -0700

Jack Campin writes:
| So operations on large-scale ABC databases are likely to become more
| important - things like:
|
| - storing the tunes in a database, parsing and indexing them at entry
|   time
|
| - distributed versioning, so that an ABC creator could get forwarding
|   pointers or editing commands inserted into superseded copies of tunes
|
| - plagiarism search (assume the entire Harry Fox Agency database)
|
| - mending corrupt tunes by finding better versions of garbled parts
|
| - collating information from all copies of a tune, so you could unify
|   the discography from one copy with the detailed formatting code from
|   another
|
| - indexing the corpus of tunes by harmonic progression, as calculated
|   by something like ABCMus's auto-harmonizer, and allowing fuzzy
|   search
|
| All of that would be easier if persistent parse trees were available.
|
| It's a pity the Tune Finder doesn't yet have options to download
| everything it knows about or synchronize your own mirror with it.
| In the long run this might be *less* resource-intensive than what
| it's presently doing, as complete downloads could be offloaded onto
| mirror sites and intelligent synchronization of updates doesn't need
| to be any more expensive than search.


Actually, doing that would be not just feasible; it  would  be  easy,
except  for that one little elephant hiding over there in the corner:
Copyright.

My search bot obviously does download every file that it  scans.   It
normally throws them away. But it has a flag saying to cache any file
that contains a tune.  I use that occasionally when I'm  testing  new
ideas,  so that I don't have to repeatedly hit some poor unsuspecting
server for a file.  It really speeds up testing to have  a  few  good
test files on the local disk.  It's set up so that all my search
program has to do to use the cached version of a file is to replace
the "://" in its URL with "/"; the result is the path of the cached
file.
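
The URL-to-cache mapping described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (replace "://" with "/"). The function name url_to_cache_path and the cache_root parameter are assumptions made for the example, not names from the actual search bot.

```python
import posixpath


def url_to_cache_path(url, cache_root=""):
    """Map a URL to a cache path by replacing its "://" with "/".

    cache_root is a hypothetical directory under which cached files
    would live; the real cache location is not given in the post.
    """
    # Replace only the first "://" so a later "://" in a query
    # string (if any) is left untouched.
    relative = url.replace("://", "/", 1)
    return posixpath.join(cache_root, relative)


if __name__ == "__main__":
    print(url_to_cache_path("http://example.com/tunes/reels.abc"))
```

Under this scheme each URL scheme (http, ftp, ...) becomes a top-level directory in the cache, and each host a directory beneath it.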

However, I've never told people where to find the cache.  Most of the
time it's empty, and you won't find anything there. This is because I
don't have permission to "mirror" everyone's files.  I have the space
available,  but  I'm not at all sure I'd even want to try negotiating
permissions with the 280 sites that the search bot knows about.

A scan just finished early this morning, and the cache is full at the
moment.  So if the above is sufficient clue, interested parties could
find it.  It'll probably even stay around for a few days, since  I've
been doing some experimenting with some ideas. But it could vanish at
any time.

It is somewhat a pity that the current copyright laws do so effective
a  job  of  blocking useful and innovative ideas like those in Jack's
list.  Maybe everyone should be getting together lists of such  ideas
and  hitting  their  politicians for changes in the laws to encourage
development.  A country that legalizes such innovation  could  likely
become a center of development.

If I could be assured that I wouldn't be prosecuted  or  banned  from
using  the  Internet  for  doing  so,  I could easily make my cache a
permanent part of my collection. I'd just not delete it. Then I could
try  writing  a web page that lets you combine them, or extract tunes
from some of them into a new file, or whatever.  But the  way  things
are  going  these  days,  attempting something like this could easily
produce some rather huge fines.

(I've recently been wondering if it might be time to  start  learning
Mandarin.   And  there's  some  wonderful music from that part of the
world. ;-)

(And there's the ongoing problem of the variety of  what  passes  for
ABC on the Net. I really wish we could get people to stop burying ABC
inside HTML.  That's a real nightmare for a  programmer.   It's  much
worse  than the minor differences in ABC dialects.  I'd probably just
have to ban such tunes from any software that tries to combine things
from  different sources.  Or maybe I'll find the time to write a good
DeHTMLizer ...)
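
As a rough idea of what a DeHTMLizer might involve, here is a minimal sketch using Python's standard html.parser module. It is not the author's tool. The class and function names, and the choice of which tags count as line breaks, are assumptions made for illustration. It strips tags, decodes entities such as &amp;amp;, and turns a few block-level tags into newlines.

```python
from html.parser import HTMLParser


class DeHTMLizer(HTMLParser):
    """Collect the text of an HTML page, treating some tags as line breaks."""

    # Assumed set of tags that end a line of ABC when it is buried in HTML.
    BREAKING = {"br", "p", "div", "pre", "tr", "li"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode &amp; etc. in text
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BREAKING:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.BREAKING:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def dehtmlize(html_text):
    """Return the text content of html_text, one non-blank line per row."""
    parser = DeHTMLizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    lines = (line.strip() for line in "".join(parser.parts).splitlines())
    # Blank lines are dropped here, which also drops the blank lines that
    # separate tunes in ABC; a real tool would need to keep those.
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    page = "<p>X:1<br>T:Rock &amp; Reel<br>K:D<br>|:DFA dAF:|</p>"
    print(dehtmlize(page))
```

This ignores the hard cases (text inside script or style elements, ABC split across table cells, pages that mix tunes with ordinary prose), which is where most of the real work would lie.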

To subscribe/unsubscribe, point your browser to: http://www.tullochgorm.com/lists.html
