[libreoffice-users] Re: I have a question

2012-07-28 Thread Andreas Säger

Am 28.07.2012 14:17, Lynne Stevens wrote:



omega
 The
Omega sector
America's Last
Line of
Defense

*How do I check for duplicates in a data base using another data base ?



*




SELECT A.*
FROM A JOIN B ON A.ID = B.ID

returns all rows from table A with an equivalent ID in table B.


--
For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted



Re: [libreoffice-users] Re: I have a question

2012-07-28 Thread Lynne Stevens



omega
The
Omega sector
America's Last
Line of
Defense

*That sounds like they will be merged ! . . They are to remain separate 
and the B file is to be used for a mailing thing after it has the 
duplicates removed . . the mailing thing I have figured out and what 
fields/columns to use ! !




*

On 07/28/2012 06:38 AM, Andreas Säger wrote:

Am 28.07.2012 14:17, Lynne Stevens wrote:



omega
 The
Omega sector
America's Last
Line of
Defense

*How do I check for duplicates in a data base using another data base ?



*




SELECT A.*
FROM A JOIN B ON A.ID = B.ID

returns all rows from table A with an equivalent ID in table B.





--
For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted



[libreoffice-users] Re: I have a question

2012-07-28 Thread Alexander Thurgood
Le 28/07/12 14:17, Lynne Stevens a écrit :

Hi Lynne,


 *How do I check for duplicates in a data base using another data base ?
 

By another data base, do you mean :

(a) another ODB file ?

(b) another type of database manager, e.g. Access vs Calc, or MySQL vs
Access, or even Access vs Access ?

(c) another table in the same ODB file or in the same database schema
managed by a unique database manager ?

(d) any or all of the above combined ?

As you can see, the possibilities belie what would appear to be a
seemingly simple question.


Andreas has answered (c) for you, i.e. where both tables are in the same
database schema.


There is no simple way to compare the contents of 2 HSQLDB/ODB files,
other than extracting the contents from one and importing it into the
other as a temporary or new table and then comparing the table results
from within the single ODB file using a JOIN query.

As for (b) LO Base does not allow you to adress two separate database
schema as far as I know from within a single query. I heard tell once
that it might be possible with Access tables being read by LO if an ODBC
connection was being used, and this only would work on Windows. Not
having tried it, I can't say whether it is true or not.


If what you are aiming for is indeed cross-schema queries in order to be
able to compare, this has been a requested (and never implemented)
feature in OpenOffice.org for many years.



Alex


-- 
For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted



Re: [libreoffice-users] Re: I have a question

2012-07-28 Thread Dan

Comment inline:

Lynne Stevens wrote:



omega The Omega sector America's Last Line of Defense

*That sounds like they will be merged ! . . They are to remain
separate and the B file is to be used for a mailing thing after it
has the duplicates removed . . the mailing thing I have figured out
and what fields/columns to use ! !


 You need to be more specific in your description. You began with
two databases that you want to check for duplicates. It has now become 
clear that you are working with two tables. And then you mentioned 
fields. What fields exist in these two tables? What determines whether a 
row in the B table is a duplicate based upon what is contained in 
table A?
 The simple answer to your question: the same way that you would do 
it by hand. The only problem with this answer is that it does not 
include any specifics.
 If you would write down how you would do this by hand step by 
step, then we would know what has to be done step by step, and we could 
suggest how to do each step. You might even see how to do it by yourself.


--Dan


* On 07/28/2012 06:38 AM, Andreas Säger wrote:

Am 28.07.2012 14:17, Lynne Stevens wrote:



omega The Omega sector America's Last Line of Defense

*How do I check for duplicates in a data base using another data
base ?



*




SELECT A.* FROM A JOIN B ON A.ID = B.ID

returns all rows from table A with an equivalent ID in table B.








--
For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted



Re: [libreoffice-users] Re: I have a question

2012-07-28 Thread drew jensen
On Sat, 2012-07-28 at 08:35 -0700, Lynne Stevens wrote:
 
 omega
   The
 Omega sector
 America's Last
 Line of
 Defense
 
 *That sounds like they will be merged ! . . They are to remain separate 
 and the B file is to be used for a mailing thing after it has the 
 duplicates removed . . the mailing thing I have figured out and what 
 fields/columns to use ! !

Hi Lynne, others

First - do you want to do this here or on the Ubuntu forums.. :)


If this where a relational database and two tables (with the same
columns structure) then it is a simple MINUS function:

SELECT * FROM TABLE1
MINUS
SELECT * FROM TABLE2

That statement will return all records in TABLE1 that are not also in
TABLE2.

However you have here 2 CSV files, and one of those files IIRC has a
column with a different name.

I don't thin you told folks here about how you are getting the data.

Andreas may have a good way to do that completely in Calc (he is much
better at Calc then I ever will be)

However is this is something you will need to do on a continuing basis
(with new CSV files arriving from time to time) then I would set this up
under Base and attach the two CSV files as TEXT tables. With that done,
then you can use the MINUS command.

It's a little bit of work to set up the connection, but one done you can
reuse it just by overwriting the source files with the data as need. 

//drew

 
 
 
 *
 
 On 07/28/2012 06:38 AM, Andreas Säger wrote:
  Am 28.07.2012 14:17, Lynne Stevens wrote:
 
 
  omega
   The
  Omega sector
  America's Last
  Line of
  Defense
 
  *How do I check for duplicates in a data base using another data base ?
 
 
 
  *
 
 
 
  SELECT A.*
  FROM A JOIN B ON A.ID = B.ID
 
  returns all rows from table A with an equivalent ID in table B.
 
 
 
 



-- 
For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted


Re: [libreoffice-users] Re: I have a question

2012-07-28 Thread Lynne Stevens



omega
The
Omega sector
America's Last
Line of
Defense

*There are 2 FILES / Data bases each one has the same number of columns 
126 each has same information except in Column R which has a single 
letter denoting its status as to what kind of sale it is . . Like bank 
sale, for sale by owner, and for sale by a realestate place and so on !

*

*That seems strange as if you wanted information you would have to 
delete the duplicates
in any data base or it would grow by gobs . . I remember in WAY BACK 
time in Super Calc there was a way of doing it  and it is not open to 
Libre-office Calc ? . . haha that was on a big 3.5 inch disk the entire 
program hahahaha went into windows 3.1 . . 500 years ago . .



*

On 07/28/2012 09:39 AM, Alexander Thurgood wrote:

Le 28/07/12 14:17, Lynne Stevens a écrit :

Hi Lynne,



*How do I check for duplicates in a data base using another data base ?


By another data base, do you mean :

(a) another ODB file ?

(b) another type of database manager, e.g. Access vs Calc, or MySQL vs
Access, or even Access vs Access ?

(c) another table in the same ODB file or in the same database schema
managed by a unique database manager ?

(d) any or all of the above combined ?

As you can see, the possibilities belie what would appear to be a
seemingly simple question.


Andreas has answered (c) for you, i.e. where both tables are in the same
database schema.


There is no simple way to compare the contents of 2 HSQLDB/ODB files,
other than extracting the contents from one and importing it into the
other as a temporary or new table and then comparing the table results
from within the single ODB file using a JOIN query.

As for (b) LO Base does not allow you to adress two separate database
schema as far as I know from within a single query. I heard tell once
that it might be possible with Access tables being read by LO if an ODBC
connection was being used, and this only would work on Windows. Not
having tried it, I can't say whether it is true or not.


If what you are aiming for is indeed cross-schema queries in order to be
able to compare, this has been a requested (and never implemented)
feature in OpenOffice.org for many years.



Alex





--
For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted



Re: [libreoffice-users] Re: I have a question

2012-07-28 Thread Lynne Stevens



omega
The
Omega sector
America's Last
Line of
Defense

*Oh yeah both files are CVS and are the way Thomas downloads them but 
being he is not to computer literate and most likely does not know he 
can save it in different ways . . No telling . . I ask one time and he 
said that is how it gets downloaded




*

On 07/28/2012 09:39 AM, Alexander Thurgood wrote:

Le 28/07/12 14:17, Lynne Stevens a écrit :

Hi Lynne,



*How do I check for duplicates in a data base using another data base ?


By another data base, do you mean :

(a) another ODB file ?

(b) another type of database manager, e.g. Access vs Calc, or MySQL vs
Access, or even Access vs Access ?

(c) another table in the same ODB file or in the same database schema
managed by a unique database manager ?

(d) any or all of the above combined ?

As you can see, the possibilities belie what would appear to be a
seemingly simple question.


Andreas has answered (c) for you, i.e. where both tables are in the same
database schema.


There is no simple way to compare the contents of 2 HSQLDB/ODB files,
other than extracting the contents from one and importing it into the
other as a temporary or new table and then comparing the table results
from within the single ODB file using a JOIN query.

As for (b) LO Base does not allow you to adress two separate database
schema as far as I know from within a single query. I heard tell once
that it might be possible with Access tables being read by LO if an ODBC
connection was being used, and this only would work on Windows. Not
having tried it, I can't say whether it is true or not.


If what you are aiming for is indeed cross-schema queries in order to be
able to compare, this has been a requested (and never implemented)
feature in OpenOffice.org for many years.



Alex





--
For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted



[libreoffice-users] Re: I have a question

2012-07-28 Thread Andreas Säger

Am 28.07.2012 22:03, Lynne Stevens wrote:



omega
 The
Omega sector
America's Last
Line of
Defense

*Ok Thomas downloads 2 Data bases ( CVS format ) he wants to remove
Duplicates from Data Base B using Data base A and THEN Data Base A gets
deleted when done
*



OK, that is

$ cat fileA.csv fileB.csv | sort | uniq  fileC.csv




--
For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted



Re: [libreoffice-users] Re: I have a question

2012-07-28 Thread Andrew Brager

Not sure who asked the original question, nor what OS you're on but...

If you're using Linux and the files are in fact text files (CVS) as 
indicated below, then I would think something to the effect of:


cat file1 file2  file3 | sort | uniq  uniqueRecords.txt

Might do part of the trick.  I say might because it's been too many 
years since I hung up my Unix/Linux hat and I've forgotten most of the 
good stuff.  Sorry I can't be more helpful.


The above translates into English as: merge file1 and file2 into file3, 
sort file3 and then find those lines that are NOT duplicated (i.e. 
unique) and store them in uniqueRecords.txt. uniq has an option to 
return the opposite, i.e. those lines that ARE duplicates.


so then you'd do:
cat file3 | sort | uniq -d  duplicateRecords.txt

Now eliminate the duplicates from the duplicateRecords.txt file
cat duplicateRecords.txt | uniq  almostDone.txt

Finally, merge the first set of unique records with the unique duplicates.
cat uniqueRecords.txt almostDone.txt  done.csv

Here's the same thing without all the comments:
cat file1 file2  file3 | sort | uniq  uniqueRecords.txt
cat file3 | sort | uniq -d  duplicateRecords.txt
cat duplicateRecords.txt | uniq  almostDone.txt
cat uniqueRecords.txt almostDone.txt  done.csv

Not having a Unix/Linux system handy I can't test it, but you should see 
results like this if I got it right:

Assuming file1 is composed of:
x,y,z
a,b,c
d,e,f
and file2:
a,b,c
g,h,i
j,k,l

Results for each line above should be as follows:
1.
d,e,f
g,h,i
j,k,l
x,y,z

2.
a,b,c
a,b,c

3. a,b,c

4.
a,b,c
d,e,f
g,h,i
j,k,l
x,y,z


Another possible tool to look into is AWK.  If you know it, it could be 
even simpler than the above 4 line script.
diff and/or one of it's variants might be another potential tool to get 
the job done.



On 7/28/2012 12:42 PM, Lynne Stevens wrote:



omega
The
Omega sector
America's Last
Line of
Defense

*Oh yeah both files are CVS and are the way Thomas downloads them but 
being he is not to computer literate and most likely does not know he 
can save it in different ways . . No telling . . I ask one time and he 
said that is how it gets downloaded




*

On 07/28/2012 09:39 AM, Alexander Thurgood wrote:

Le 28/07/12 14:17, Lynne Stevens a écrit :

Hi Lynne,



*How do I check for duplicates in a data base using another data base ?









--
For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted



Re: [libreoffice-users] Re: I have a question

2012-07-28 Thread Lynne Stevens



omega
The
Omega sector
America's Last
Line of
Defense

*Hey that sounds good to me I will have Thomas bring over a set of the 
files

and I will use it . . will make Thomas jump up and down and happy



*

On 07/28/2012 01:28 PM, Andreas Säger wrote:

Am 28.07.2012 22:03, Lynne Stevens wrote:



omega
 The
Omega sector
America's Last
Line of
Defense

*Ok Thomas downloads 2 Data bases ( CVS format ) he wants to remove
Duplicates from Data Base B using Data base A and THEN Data Base A gets
deleted when done
*



OK, that is

$ cat fileA.csv fileB.csv | sort | uniq  fileC.csv







--
For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted



Re: [libreoffice-users] Re: I have a question

2012-07-28 Thread Johnny Rosenberg
2012/7/28 Andreas Säger ville...@t-online.de:
 Am 28.07.2012 22:03, Lynne Stevens wrote:



 omega
  The
 Omega sector
 America's Last
 Line of
 Defense

 *Ok Thomas downloads 2 Data bases ( CVS format ) he wants to remove
 Duplicates from Data Base B using Data base A and THEN Data Base A gets
 deleted when done
 *


 OK, that is

 $ cat fileA.csv fileB.csv | sort | uniq  fileC.csv
And then:
$ rm fileA*



Kind regards

Johnny Rosenberg
ジョニー・ローゼンバーグ

-- 
For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted


Re: [libreoffice-users] Re: I have a question

2012-07-28 Thread Jay Lozier

On 07/28/2012 03:42 PM, Lynne Stevens wrote:



omega
The
Omega sector
America's Last
Line of
Defense

*Oh yeah both files are CVS and are the way Thomas downloads them but 
being he is not to computer literate and most likely does not know he 
can save it in different ways . . No telling . . I ask one time and he 
said that is how it gets downloaded

Lynne,

What I think you are doing is loading two tables from csv files into the 
*same* database. With databases one finds there is a fairly precise 
terminology used. Once you have the data loaded then on can query the 
data. The tables do not need to have the same structure only be related 
in some logical manner.


If I understand the project you download a CSV file with addresses that 
has many duplicates of data in a table in the database. The next step is 
find only those addresses that are not duplicates so they can be 
appended to the existing table.


If you upload the new data into a table (Temp) you can then write a 
query to find new/updated entries in the existing table (Addresses).


Roughly the query would look something like this

SELECT relevant columns from Temp, each column listed in the following 
format t.columnname
FROM Temp AS t, Addresses AS a (using an alias you do not need to type 
the full table name)
WHERE t.columnname != a.columnname - more than one criteria can be used 
such as customer name, address, city, phone number, etc to identify the 
duplicates.


For the WHERE clause I would look for some unique id used in the data 
sets, such as a customer number.




*

On 07/28/2012 09:39 AM, Alexander Thurgood wrote:

Le 28/07/12 14:17, Lynne Stevens a écrit :

Hi Lynne,



*How do I check for duplicates in a data base using another data base ?


By another data base, do you mean :

(a) another ODB file ?

(b) another type of database manager, e.g. Access vs Calc, or MySQL vs
Access, or even Access vs Access ?

(c) another table in the same ODB file or in the same database schema
managed by a unique database manager ?

(d) any or all of the above combined ?

As you can see, the possibilities belie what would appear to be a
seemingly simple question.


Andreas has answered (c) for you, i.e. where both tables are in the same
database schema.


There is no simple way to compare the contents of 2 HSQLDB/ODB files,
other than extracting the contents from one and importing it into the
other as a temporary or new table and then comparing the table results
from within the single ODB file using a JOIN query.

As for (b) LO Base does not allow you to adress two separate database
schema as far as I know from within a single query. I heard tell once
that it might be possible with Access tables being read by LO if an ODBC
connection was being used, and this only would work on Windows. Not
having tried it, I can't say whether it is true or not.


If what you are aiming for is indeed cross-schema queries in order to be
able to compare, this has been a requested (and never implemented)
feature in OpenOffice.org for many years.



Alex








--
Jay Lozier
jsloz...@gmail.com


--
For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted



Re: [libreoffice-users] Re: I have a question

2012-07-28 Thread Lynne Stevens



omega
The
Omega sector
America's Last
Line of
Defense

*Since it has the recorder office number ( column A ) on the line of all 
the John Kings and if the number is the same it will be deleted but if 
not and there are two different John's and it has a different recorders 
office number it will stay which is good . . and the column R has a 
letter which says it is not the same . . there are 2 columns that are in 
this which says which one goes and which stays . . the recorders office 
number Column A and the Status Column R are the ones which do all the 
work rest are the same stuff . .



*

On 07/28/2012 03:24 PM, Andrew Brager wrote:
Keep in mind that you do in fact lose the duplicates completely.  I 
would think you want to keep one of the duplicates.


So for example if the merged files give you 4 records total:

John King
John King
Peter Rabbit
Bugs Bunny


You end up with 2 records - Peter Rabbit and Bugs Bunny - losing John 
King completely.  The solution I presented returns all 3 unique 
records,  losing just one of the repeated names instead of both, which 
to me makes more sense.  Obviously you know your application better 
than I do, just want to make sure you're not surprised when 25% (or 
whatever percentage) of your database goes missing.



On 7/28/2012 2:22 PM, Lynne Stevens wrote:




*Hey that sounds good to me I will have Thomas bring over a set of 
the files

and I will use it . . will make Thomas jump up and down and happy



Defense

*Ok Thomas downloads 2 Data bases ( CVS format ) he wants to remove
Duplicates from Data Base B using Data base A and THEN Data Base A 
gets

deleted when done
*



OK, that is

$ cat fileA.csv fileB.csv | sort | uniq  fileC.csv













--
For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/users/
All messages sent to this list will be publicly archived and cannot be deleted