Re: [sqlite] Compare and INSERT INTO syntax issue

2013-09-23 Thread Keith Medcalf
Omit the values keyword since you do not have any values (you are using a 
select instead).

 INSERT OR REPLACE INTO SONGS (skipcount) 
 SELECT substr( custom5, 1, 4) 
   FROM SONGS 
  WHERE custom5 > '%' 
and skipcount < cast(substr(custom5, 1, 4) as int);

the Values clause introduces a constant row to insert, as in insert into x 
(a,b,c,d) values (1,2,3,4);

> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Kristopher Roy
> Sent: Sunday, 22 September, 2013 16:31
> To: sqlite-users@sqlite.org
> Subject: [sqlite] Compare and INSERT INTO syntax issue
> 
> I am trying to grab the value of custom5 % and compare it to
> skipcount
> and then if it is a higher value write it to skipcount, but something is
> wrong with my syntax, can anyone help?
> 
> INSERT OR REPLACE INTO SONGS (skipcount) Values SELECT substr( custom5,
> 1,
> 4) FROM SONGS WHERE custom5 > '%' and skipcount < cast( substr(
> custom5, 1, 4) as int );
> 
> --
> Thank You,
> Kristopher C. Roy
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] EncFs + sqlite3

2013-09-23 Thread Clemens Ladisch
Paolo Bolzoni wrote:
> What do you mean with "if you do not ATTTACH databases"?

EncFS does not implement .fsyncdir, which is used by SQLite when
deleting the master journal file.  Such a file is used for transactions
when there are multiple database files, i.e., when you have used ATTACH.

But I overlooked another case where SQLite uses fsync on a directory,
which is when it creates any journal or WAL file.  So, regardless of
having used ATTACH or not, your data and transactions are safe in case
of a crash only if you are using a journaling file system below EncFS.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Problem with method numRows() in Sqlite3

2013-09-23 Thread Tim Streater
On 23 Sep 2013 at 04:50, pisey phon  wrote: 

> I got an error "Call to undefined method SQLite3Result::numRows()". and here
> is my code:
> $db = new Sqlite3("sample.db");
> $result = $db->query("select * from table");
> $rows = $result->numRows();

Sorry, ignore my last mail. DO this:

$resl = $db->query ("select * from table");
$regl = $resl->fetchAll (PDO::FETCH_ASSOC);
$numl = count ($regl);  // $numl has number of rows


--
Cheers  --  Tim
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] question on WAL files

2013-09-23 Thread Dan Kennedy

On 09/20/2013 11:30 PM, Carey, John wrote:

Hello,

I am trying to write a simple perl program (perl version 5.16.3). I am 
accessing a SQLite db which is using WAL(write ahead logging). It took me a 
while but I learned that my connection is only looking at the DB file and not 
the WAL file. As a result, my queries are not returning the full data set.

the SQLite firefox addon shows 10 invoices in the invoice table. the following 
program is returning only 8...

use DBD::SQLite;
use DBI;
use Tkx;
use File::Copy;
use PDF::API2;
use constant mm => 25.4 / 72;
use constant in => 1 / 72;
use constant pt => 1;

# Prompt user for database file location

$dbfile = Tkx::tk___getOpenFile(-initialdir=>'C:/events_p', -title=>'Please 
select your current events database file (event.sqlite)');
$dbfile =~ s/\//\\/g;

# Initialiatize path variables
# parse out the db filename and path
$DbFilename=substr($dbfile,length($dbfile)-12,12);
$FolderPath=substr($dbfile,0,length($dbfile)-13);
$OriginalImages_Path=$FolderPath.\\OriginalImages;
$Internal_Path=$FolderPath.\\_internal;
$Carts_Path=$FolderPath.\\Carts;
$Orders_Path=$FolderPath.\\Orders;
$favorites="c:\favorites";
$Invoices_Folder="C:\\events_p_invoices";

# Error and exit if incorrect DB filename was selected
if ($DbFilename ne 'event.sqlite')
{
 print "\n ERROR , Invalid Database Filename Selected\n";
 print "  You MUST select the filename called  event.sqlite\n";
 sleep(10);
 exit;
}

# Connect to the Database
my $dbh2 = DBI->connect("dbi:SQLite:$dbfile", "", "",{RaiseError => 1, AutoCommit 
=> 1}) or die $DBI::errstr;;
$dbh2->do('PRAGMA journal_mode=WAL');
my $sth = $dbh2->prepare("SELECT count(*) fromInvoice");

$sth->execute();
   while (@data = $sth->fetchrow_array())
  {
   print "count=$data[0]\n";
  }

exit;

If I go into the firefox add on and run a checkpoint , the 2 new invoices get 
flushed from the WL to the db file and then my program sees them
what am I doing wrong where my script is only looking at the db file and not 
the WAL file?


Are there multiple links in the file-system to the database file?

More information in section 2.5 here: 
http://www.sqlite.org/howtocorrupt.html


Dan.




thanks!

John
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Problem with method numRows() in Sqlite3

2013-09-23 Thread Tim Streater
On 23 Sep 2013 at 04:50, pisey phon  wrote: 

> I got an error "Call to undefined method SQLite3Result::numRows()". and here
> is my code:
> $db = new Sqlite3("sample.db");
> $result = $db->query("select * from table");
> $rows = $result->numRows();

Don't do that. Do:

$rows = count ($result);

--
Cheers  --  Tim
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Problem with method numRows() in Sqlite3

2013-09-23 Thread Paolo Bolzoni
As far as I know the only way to know the number of returned
rows is scanning all the query result. Isn't it?

On Mon, Sep 23, 2013 at 5:48 PM, John McKown
 wrote:
> Basically, this is telling you that there is no such function as
> "Sqlite3::numRows". I did a fast scan of the sqlite3 documentation, and
> there does not appear to be a sqlite3 function which returns this type of
> information. The PHP interface appears to just be a wrapper for PHP around
> existing sqlite3 calls, and since sqlite3 does not implement this function,
> neither does the PHP interface.
>
> From my looking, such as it was, I would say that you need to either do a
> $result->fetcharray(), then see how big that array is. Or, if you don't
> really want the data, do a
> $result=$db->query("select count(*) as numRows from table");
>
> if you just want a count. Lastly, if you need to check the number of rows
> without fetching them, then you'll need to run both queries.
>
> Again, the above is "as best as I can tell". I am not a PHP expert.
>
>
> On Sun, Sep 22, 2013 at 10:50 PM, pisey phon  wrote:
>
>> Dear Sqlite Team
>>
>>
>> I have some problem with Sqlite3.
>>
>>
>> Would you mind if I want to ask you a question?
>>
>>
>> I got an error "Call to undefined method SQLite3Result::numRows()". and
>> here
>> is my code:
>> $db = new Sqlite3("sample.db");
>> $result = $db->query("select * from table");
>> $rows = $result->numRows();
>>
>>
>> Please help me. Thanks in advance.
>>
>>
>> Pisey Phon
>> Junior Web Developer.
>>
>>
>>
>> --
>> View this message in context:
>> http://sqlite.1065341.n5.nabble.com/Problem-with-method-numRows-in-Sqlite3-tp71420.html
>> Sent from the SQLite mailing list archive at Nabble.com.
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>
>
>
> --
> As of next week, passwords will be entered in Morse code.
>
> Maranatha! <><
> John McKown
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Compare Similar Cells

2013-09-23 Thread Kristopher Roy
Roger,
Thank you, I will start working on it per your suggestions, though I would
love to get mine cleaned up easily, I want to provide a solution that other
users can use :)


On Sun, Sep 22, 2013 at 11:14 PM, Roger Binns  wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> On 22/09/13 19:41, Kristopher Roy wrote:
> > I have a table of songs, several have similar titles I can't find where
> > to get started. I tried this but its not right. Select SongTitle,
> > COUNT(SongTitle) AS LIKE_COUNT FROM Songs
>
> I did work with a database that came from a company that sold music.
> Their source data came from the various record companies and was a
> complete mess.  (Yes record companies would make mistakes even for their
> own artists!)  For our purposes the data had to be denormalised,
> deduplicated and many items merged where the differences weren't important.
>
> Fixing up the data required probabilistic matching, and can't be done in
> simple SQL queries.  For example spelling mistakes had to be accounted
> for, truncations, case differences, punctuation differences, numeric
> differences (eg "Song One" vs "Song 1", "Album 3" vs "Album III", "Vol 4"
> vs "Volume 4."), mixes (eg "Song One" vs "Song One (Radio Edit)"),
> compositions of multiple artists or contributing to another artists songs
> so "artist" becomes murky, etc.
>
> This could only be achieved by processing all the data in advance.
> Essentially every artist had to be scored against every other to see if
> they were the same (but not similar), same for every album of that artist
> against their other albums, and finally of all the songs within each
> album.  It required a lot of inspecting the matches, finding anomalies,
> doing google searches to find canonical information, adding heuristics,
> making sure that heuristics changes did not break existing good matches,
> and endless repeats until things are good enough.
>
> If you are trying to do a good job, then you will need to do something
> like that.
>
> If you are trying to fix up your own collection, then first go in and fix
> all the meta data.  Musicbrainz is a good source for authoritative
> information and there are plenty of apps out there to help you edit and
> update tags.
>
> If you want a quick fix, then add another column to your SQLite database
> that contains the normalised song title.  You will need to iterate over
> all your existing data to calculate a normalised title.  For example
> convert to all upper case, remove all punctuation, convert multiple spaces
> to single, remove "digits -" as you gave in your example, truncating to 30
> characters, and whatever else is relevant for your data.  You can now do
> matching against the normalised title column for each title.
>
> Roger
>
>
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.4.12 (GNU/Linux)
>
> iEYEARECAAYFAlI/3EAACgkQmOOfHg372QSuHgCgla77zTSx5knJL036AMpU0Unx
> JnEAoJ9Cx/kocO3ue4xafKFkM7BVEviE
> =RDic
> -END PGP SIGNATURE-
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
Thank You,
Kristopher C. Roy
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Problem with method numRows() in Sqlite3

2013-09-23 Thread Paolo Bolzoni
You are using a binding, right?
Please, can you show the problem with the plain C interface?

On Mon, Sep 23, 2013 at 5:50 AM, pisey phon  wrote:
> Dear Sqlite Team
>
>
> I have some problem with Sqlite3.
>
>
> Would you mind if I want to ask you a question?
>
>
> I got an error "Call to undefined method SQLite3Result::numRows()". and here
> is my code:
> $db = new Sqlite3("sample.db");
> $result = $db->query("select * from table");
> $rows = $result->numRows();
>
>
> Please help me. Thanks in advance.
>
>
> Pisey Phon
> Junior Web Developer.
>
>
>
> --
> View this message in context: 
> http://sqlite.1065341.n5.nabble.com/Problem-with-method-numRows-in-Sqlite3-tp71420.html
> Sent from the SQLite mailing list archive at Nabble.com.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Problem with method numRows() in Sqlite3

2013-09-23 Thread John McKown
Basically, this is telling you that there is no such function as
"Sqlite3::numRows". I did a fast scan of the sqlite3 documentation, and
there does not appear to be a sqlite3 function which returns this type of
information. The PHP interface appears to just be a wrapper for PHP around
existing sqlite3 calls, and since sqlite3 does not implement this function,
neither does the PHP interface.

>From my looking, such as it was, I would say that you need to either do a
$result->fetcharray(), then see how big that array is. Or, if you don't
really want the data, do a
$result=$db->query("select count(*) as numRows from table");

if you just want a count. Lastly, if you need to check the number of rows
without fetching them, then you'll need to run both queries.

Again, the above is "as best as I can tell". I am not a PHP expert.


On Sun, Sep 22, 2013 at 10:50 PM, pisey phon  wrote:

> Dear Sqlite Team
>
>
> I have some problem with Sqlite3.
>
>
> Would you mind if I want to ask you a question?
>
>
> I got an error "Call to undefined method SQLite3Result::numRows()". and
> here
> is my code:
> $db = new Sqlite3("sample.db");
> $result = $db->query("select * from table");
> $rows = $result->numRows();
>
>
> Please help me. Thanks in advance.
>
>
> Pisey Phon
> Junior Web Developer.
>
>
>
> --
> View this message in context:
> http://sqlite.1065341.n5.nabble.com/Problem-with-method-numRows-in-Sqlite3-tp71420.html
> Sent from the SQLite mailing list archive at Nabble.com.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
As of next week, passwords will be entered in Morse code.

Maranatha! <><
John McKown
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] EncFs + sqlite3

2013-09-23 Thread Paolo Bolzoni
What do you mean with "if you do not ATTTACH databases"?

I should not use the ATTACH DATABASE command to add
a second database to a connection or I cannot use a db at
all?

On Mon, Sep 23, 2013 at 3:43 PM, Clemens Ladisch  wrote:
> Paolo Bolzoni wrote:
>> I was wondering, is using sqlite3 under EncFs safe?
>> For "safe" I mean is the db is strong against data corruption
>> as in a usual filesystem?
>
> EncFS implements the .fsync callback but not .fsyncdir, so the deletion
> of the master journal is not synchronized, so your data and transactions
> are safe in case of a crash only if you do not ATTTACH databases or if
> you are using a journaling file system.  (EncFS does not implement
> .lock, but it is not a network file system, so it can rely on the
> kernel's automatic handling of locking for local files.)
>
>
> Regards,
> Clemens
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Sqlite3

2013-09-23 Thread Pisey Phon
Dear sqlite users

I have some problem with Sqlite3. Would you mind if I want to ask you a
question?

I got an error like this:
Call to undefined method Sqlite3Result:: numRows().

Here is my code:
$db = new Sqlite3("sample.db");
$result = $db->query("select * from table");
$rows = $result->numRows();


Any Solution, please help me.


Thanks in advance.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Problem with method numRows() in Sqlite3

2013-09-23 Thread pisey phon
Dear Sqlite Team


I have some problem with Sqlite3.


Would you mind if I want to ask you a question?


I got an error "Call to undefined method SQLite3Result::numRows()". and here
is my code:
$db = new Sqlite3("sample.db");
$result = $db->query("select * from table");
$rows = $result->numRows();


Please help me. Thanks in advance.


Pisey Phon
Junior Web Developer.



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Problem-with-method-numRows-in-Sqlite3-tp71420.html
Sent from the SQLite mailing list archive at Nabble.com.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] EncFs + sqlite3

2013-09-23 Thread Clemens Ladisch
Paolo Bolzoni wrote:
> I was wondering, is using sqlite3 under EncFs safe?
> For "safe" I mean is the db is strong against data corruption
> as in a usual filesystem?

EncFS implements the .fsync callback but not .fsyncdir, so the deletion
of the master journal is not synchronized, so your data and transactions
are safe in case of a crash only if you do not ATTTACH databases or if
you are using a journaling file system.  (EncFS does not implement
.lock, but it is not a network file system, so it can rely on the
kernel's automatic handling of locking for local files.)


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] EncFs + sqlite3

2013-09-23 Thread Paolo Bolzoni
Dear list,
I was wondering, is using sqlite3 under EncFs safe?
For "safe" I mean is the db is strong against data corruption
as in a usual filesystem?

Do anyone has experience? It would be reasonably easy to
test?

Your faithfully,
Paolo
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Compare Similar Cells

2013-09-23 Thread Stephen Chrzanowski
To the OP:
I've got a tonne of media on my Drobo, and I feel your pain.  Between my
MP3s, pictures, videos and books, keeping tabs on file names, physical file
locations, its quite the nightmare, and by the sounds of it, Roger has been
exposed to this issue more so than I have been, and with my limited
experience with just my central repository, well, its a daunting and time
consuming task.

For MP3s, there are plenty of applications out there, both paid and free,
that will handle tags with ease, check file names, and I've even heard of
applications that will "listen" to each MP3 and see how close they sound to
each other.  A few years back, on my first initial attempt to organize my
MP3s, I used Winamp to edit the tags and rename the physical files,
however, moving the files to an appropriate directory was still a 'manual'
process IIRC, but the physical file management did become easier with the
file renames, and Google came in to help me track down lyrics and put the
name of a song to an MP3.  Then there's how iTunes handles naming files it
downloads.  *sighs and goes and curls up in a corner*

Basically, there is no magic bullet for organizing based on file names.  As
Roger mentions, a lot of it is going to be a manual process.

If you're developing an application that would at least get you started,
what I would do is this:
- Get a full directory list of all files to be taken into consideration and
put them into two tables.  One table for the paths that contain the files,
and another table to hold the file names.  FK between these two tables.
- Run through the list of file names and extract each word and put each
word into a third table, with another FK relationship to just the file name
table.  For proper normalization, I could also put a 4th table in the mix
to make a "many to many" relationship between the word list and file name
list, which I would ensure the third table has only unique words.
Depending on how you write and your preferences, this would be up to you.
- Have a sorted list of words (Table 3) in a list box on the GUI so that
when I click on a word, another list box is populated with the file names
and/or the paths of the files found.
- Have some kind of event/trigger that would allow me to rename, move, or
edit the MP3 ID3 tags of the file in the second list box.  Update the
database, word list (if needed) and continue on.

The reason for the word list is that even though you can search for a
particular word, you'd never really be able to ensure a 100% hit on every
file by doing a manual key entry, not to mention what you would type in
with proper spelling may show up in the word list as being spelled
incorrectly.  I would then go through each word entry and see if I can find
duplicate file names in the second list box.  I could also introduce
reading the ID tags directly from the MP3 and edit as is.

There are many ways to handle this behemoth task.  A query, to get you
started, like [ select * from FileList where FileName like '%demon%' ]
would be something towards what you'd have to use..
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] SQLite Code Factory 13.9 released

2013-09-23 Thread SQL Maestro Group

Hi!

SQL Maestro Group announces the release of SQLite Code Factory 13.9, a 
powerful Windows GUI solution for querying SQLite databases and managing the 
data.

http://www.sqlmaestro.com/products/sqlite/codefactory/

Please note that before Oct 13 you can purchase all our products and bundles 
with up to 25% discount. Please find details at

http://www.sqlmaestro.com/purchase/

Top 10 new features:


1. Improved SQL Editor.
2. Updated SQL Dump wizard.
3. Keyboard-interactive SSH authentication.
4. New Start Page with quick access to recently connected databases.
5. Data import from several files to a single table.
6. Data export to the JSON format.
7. Enhanced data grids.
8. The Script Runner tool.
9. Colored profiles in Database Explorer.
10. New modern progress windows.

Full press-release (with explaining screenshots) is available at:
http://www.sqlmaestro.com/news/company/codefactories_updated_to_13_9/

Background information:

SQL Maestro Group offers complete database admin and management tools for 
MySQL, Oracle, MS SQL Server, DB2, PostgreSQL, SQL Anywhere, SQLite, 
Firebird and MaxDB providing the highest performance, scalability and 
reliability to meet the requirements of today's database applications.


Sincerely yours,
The SQL Maestro Group Team
http://www.sqlmaestro.com 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] multiple connection to the same DB

2013-09-23 Thread olivier Ménard
The administrator just told me, the server is on debian with CIFS, each machine 
are on Ubuntu.


i try to remember the whole session :

me :

sqlite3 birth.sqlite
create table people ...
insert into people  values('o','1')

and no closing sqlite

all other at the same time (but probably not at the same second), on different 
machines
sqlite 3 birth.sqlite

someone (name s):
insert into people values('s','2')

me : 
select * from people; got only 's','2'

someone else (name t)
insert into people values('t','3')

someone else (name u)
insert into people values('u','4')

me :
select * from people; got 't', '3' AND 'u', '4',   but 's','2' was not there












 De : Simon Slavin 
À : General Discussion of SQLite Database  
Envoyé le : Dimanche 22 septembre 2013 17h52
Objet : Re: [sqlite] multiple connection to the same DB
 


On 22 Sep 2013, at 4:50pm, Gerry Snyder  wrote:

> If I am right, then something else is happening, such as somehow using 
> different files, or something else deleting or otherwise modifying the file 
> between accesses.

Worth testing.  Have one user create an entry in the table then quit their 
program.  Then have the other user open the file, do a SELECT on the table and 
see whether the entry is there.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Compare Similar Cells

2013-09-23 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 22/09/13 19:41, Kristopher Roy wrote:
> I have a table of songs, several have similar titles I can't find where
> to get started. I tried this but its not right. Select SongTitle,
> COUNT(SongTitle) AS LIKE_COUNT FROM Songs

I did work with a database that came from a company that sold music.
Their source data came from the various record companies and was a
complete mess.  (Yes record companies would make mistakes even for their
own artists!)  For our purposes the data had to be denormalised,
deduplicated and many items merged where the differences weren't important.

Fixing up the data required probabilistic matching, and can't be done in
simple SQL queries.  For example spelling mistakes had to be accounted
for, truncations, case differences, punctuation differences, numeric
differences (eg "Song One" vs "Song 1", "Album 3" vs "Album III", "Vol 4"
vs "Volume 4."), mixes (eg "Song One" vs "Song One (Radio Edit)"),
compositions of multiple artists or contributing to another artists songs
so "artist" becomes murky, etc.

This could only be achieved by processing all the data in advance.
Essentially every artist had to be scored against every other to see if
they were the same (but not similar), same for every album of that artist
against their other albums, and finally of all the songs within each
album.  It required a lot of inspecting the matches, finding anomalies,
doing google searches to find canonical information, adding heuristics,
making sure that heuristics changes did not break existing good matches,
and endless repeats until things are good enough.

If you are trying to do a good job, then you will need to do something
like that.

If you are trying to fix up your own collection, then first go in and fix
all the meta data.  Musicbrainz is a good source for authoritative
information and there are plenty of apps out there to help you edit and
update tags.

If you want a quick fix, then add another column to your SQLite database
that contains the normalised song title.  You will need to iterate over
all your existing data to calculate a normalised title.  For example
convert to all upper case, remove all punctuation, convert multiple spaces
to single, remove "digits -" as you gave in your example, truncating to 30
characters, and whatever else is relevant for your data.  You can now do
matching against the normalised title column for each title.

Roger


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlI/3EAACgkQmOOfHg372QSuHgCgla77zTSx5knJL036AMpU0Unx
JnEAoJ9Cx/kocO3ue4xafKFkM7BVEviE
=RDic
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users