[libreoffice-users] Re: I have a question
Am 28.07.2012 14:17, Lynne Stevens wrote: omega The Omega sector America's Last Line of Defense *How do I check for duplicates in a data base using another data base ? * SELECT A.* FROM A JOIN B ON A.ID = B.ID returns all rows from table A with an equivalent ID in table B. -- For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/ Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette List archive: http://listarchives.libreoffice.org/global/users/ All messages sent to this list will be publicly archived and cannot be deleted
Re: [libreoffice-users] Re: I have a question
omega The Omega sector America's Last Line of Defense *That sounds like they will be merged ! . . They are to remain separate and the B file is to be used for a mailing thing after it has the duplicates removed . . the mailing thing I have figured out and what fields/columns to use ! ! * On 07/28/2012 06:38 AM, Andreas Säger wrote: Am 28.07.2012 14:17, Lynne Stevens wrote: omega The Omega sector America's Last Line of Defense *How do I check for duplicates in a data base using another data base ? * SELECT A.* FROM A JOIN B ON A.ID = B.ID returns all rows from table A with an equivalent ID in table B. -- For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/ Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette List archive: http://listarchives.libreoffice.org/global/users/ All messages sent to this list will be publicly archived and cannot be deleted
[libreoffice-users] Re: I have a question
Le 28/07/12 14:17, Lynne Stevens a écrit : Hi Lynne, *How do I check for duplicates in a data base using another data base ? By another data base, do you mean : (a) another ODB file ? (b) another type of database manager, e.g. Access vs Calc, or MySQL vs Access, or even Access vs Access ? (c) another table in the same ODB file or in the same database schema managed by a unique database manager ? (d) any or all of the above combined ? As you can see, the possibilities belie what would appear to be a seemingly simple question. Andreas has answered (c) for you, i.e. where both tables are in the same database schema. There is no simple way to compare the contents of 2 HSQLDB/ODB files, other than extracting the contents from one and importing it into the other as a temporary or new table and then comparing the table results from within the single ODB file using a JOIN query. As for (b) LO Base does not allow you to adress two separate database schema as far as I know from within a single query. I heard tell once that it might be possible with Access tables being read by LO if an ODBC connection was being used, and this only would work on Windows. Not having tried it, I can't say whether it is true or not. If what you are aiming for is indeed cross-schema queries in order to be able to compare, this has been a requested (and never implemented) feature in OpenOffice.org for many years. Alex -- For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/ Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette List archive: http://listarchives.libreoffice.org/global/users/ All messages sent to this list will be publicly archived and cannot be deleted
Re: [libreoffice-users] Re: I have a question
Comment inline: Lynne Stevens wrote: omega The Omega sector America's Last Line of Defense *That sounds like they will be merged ! . . They are to remain separate and the B file is to be used for a mailing thing after it has the duplicates removed . . the mailing thing I have figured out and what fields/columns to use ! ! You need to be more specific in your description. You began with two databases that you want to check for duplicates. It has now become clear that you are working with two tables. And then you mentioned fields. What fields exist in these two tables? What determines whether a row in the B table is a duplicate based upon what is contained in table A? The simple answer to your question: the same way that you would do it by hand. The only problem with this answer is that it does not include any specifics. If you would write down how you would do this by hand step by step, then we would know what has to be done step by step, and we could suggest how to do each step. You might even see how to do it by yourself. --Dan * On 07/28/2012 06:38 AM, Andreas Säger wrote: Am 28.07.2012 14:17, Lynne Stevens wrote: omega The Omega sector America's Last Line of Defense *How do I check for duplicates in a data base using another data base ? * SELECT A.* FROM A JOIN B ON A.ID = B.ID returns all rows from table A with an equivalent ID in table B. -- For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/ Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette List archive: http://listarchives.libreoffice.org/global/users/ All messages sent to this list will be publicly archived and cannot be deleted
Re: [libreoffice-users] Re: I have a question
On Sat, 2012-07-28 at 08:35 -0700, Lynne Stevens wrote: omega The Omega sector America's Last Line of Defense *That sounds like they will be merged ! . . They are to remain separate and the B file is to be used for a mailing thing after it has the duplicates removed . . the mailing thing I have figured out and what fields/columns to use ! ! Hi Lynne, others First - do you want to do this here or on the Ubuntu forums.. :) If this where a relational database and two tables (with the same columns structure) then it is a simple MINUS function: SELECT * FROM TABLE1 MINUS SELECT * FROM TABLE2 That statement will return all records in TABLE1 that are not also in TABLE2. However you have here 2 CSV files, and one of those files IIRC has a column with a different name. I don't thin you told folks here about how you are getting the data. Andreas may have a good way to do that completely in Calc (he is much better at Calc then I ever will be) However is this is something you will need to do on a continuing basis (with new CSV files arriving from time to time) then I would set this up under Base and attach the two CSV files as TEXT tables. With that done, then you can use the MINUS command. It's a little bit of work to set up the connection, but one done you can reuse it just by overwriting the source files with the data as need. //drew * On 07/28/2012 06:38 AM, Andreas Säger wrote: Am 28.07.2012 14:17, Lynne Stevens wrote: omega The Omega sector America's Last Line of Defense *How do I check for duplicates in a data base using another data base ? * SELECT A.* FROM A JOIN B ON A.ID = B.ID returns all rows from table A with an equivalent ID in table B. -- For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/ Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette List archive: http://listarchives.libreoffice.org/global/users/ All messages sent to this list will be publicly archived and cannot be deleted
Re: [libreoffice-users] Re: I have a question
omega The Omega sector America's Last Line of Defense *There are 2 FILES / Data bases each one has the same number of columns 126 each has same information except in Column R which has a single letter denoting its status as to what kind of sale it is . . Like bank sale, for sale by owner, and for sale by a realestate place and so on ! * *That seems strange as if you wanted information you would have to delete the duplicates in any data base or it would grow by gobs . . I remember in WAY BACK time in Super Calc there was a way of doing it and it is not open to Libre-office Calc ? . . haha that was on a big 3.5 inch disk the entire program hahahaha went into windows 3.1 . . 500 years ago . . * On 07/28/2012 09:39 AM, Alexander Thurgood wrote: Le 28/07/12 14:17, Lynne Stevens a écrit : Hi Lynne, *How do I check for duplicates in a data base using another data base ? By another data base, do you mean : (a) another ODB file ? (b) another type of database manager, e.g. Access vs Calc, or MySQL vs Access, or even Access vs Access ? (c) another table in the same ODB file or in the same database schema managed by a unique database manager ? (d) any or all of the above combined ? As you can see, the possibilities belie what would appear to be a seemingly simple question. Andreas has answered (c) for you, i.e. where both tables are in the same database schema. There is no simple way to compare the contents of 2 HSQLDB/ODB files, other than extracting the contents from one and importing it into the other as a temporary or new table and then comparing the table results from within the single ODB file using a JOIN query. As for (b) LO Base does not allow you to adress two separate database schema as far as I know from within a single query. I heard tell once that it might be possible with Access tables being read by LO if an ODBC connection was being used, and this only would work on Windows. Not having tried it, I can't say whether it is true or not. If what you are aiming for is indeed cross-schema queries in order to be able to compare, this has been a requested (and never implemented) feature in OpenOffice.org for many years. Alex -- For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/ Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette List archive: http://listarchives.libreoffice.org/global/users/ All messages sent to this list will be publicly archived and cannot be deleted
Re: [libreoffice-users] Re: I have a question
omega The Omega sector America's Last Line of Defense *Oh yeah both files are CVS and are the way Thomas downloads them but being he is not to computer literate and most likely does not know he can save it in different ways . . No telling . . I ask one time and he said that is how it gets downloaded * On 07/28/2012 09:39 AM, Alexander Thurgood wrote: Le 28/07/12 14:17, Lynne Stevens a écrit : Hi Lynne, *How do I check for duplicates in a data base using another data base ? By another data base, do you mean : (a) another ODB file ? (b) another type of database manager, e.g. Access vs Calc, or MySQL vs Access, or even Access vs Access ? (c) another table in the same ODB file or in the same database schema managed by a unique database manager ? (d) any or all of the above combined ? As you can see, the possibilities belie what would appear to be a seemingly simple question. Andreas has answered (c) for you, i.e. where both tables are in the same database schema. There is no simple way to compare the contents of 2 HSQLDB/ODB files, other than extracting the contents from one and importing it into the other as a temporary or new table and then comparing the table results from within the single ODB file using a JOIN query. As for (b) LO Base does not allow you to adress two separate database schema as far as I know from within a single query. I heard tell once that it might be possible with Access tables being read by LO if an ODBC connection was being used, and this only would work on Windows. Not having tried it, I can't say whether it is true or not. If what you are aiming for is indeed cross-schema queries in order to be able to compare, this has been a requested (and never implemented) feature in OpenOffice.org for many years. Alex -- For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/ Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette List archive: http://listarchives.libreoffice.org/global/users/ All messages sent to this list will be publicly archived and cannot be deleted
[libreoffice-users] Re: I have a question
Am 28.07.2012 22:03, Lynne Stevens wrote: omega The Omega sector America's Last Line of Defense *Ok Thomas downloads 2 Data bases ( CVS format ) he wants to remove Duplicates from Data Base B using Data base A and THEN Data Base A gets deleted when done * OK, that is $ cat fileA.csv fileB.csv | sort | uniq fileC.csv -- For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/ Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette List archive: http://listarchives.libreoffice.org/global/users/ All messages sent to this list will be publicly archived and cannot be deleted
Re: [libreoffice-users] Re: I have a question
Not sure who asked the original question, nor what OS you're on but... If you're using Linux and the files are in fact text files (CVS) as indicated below, then I would think something to the effect of: cat file1 file2 file3 | sort | uniq uniqueRecords.txt Might do part of the trick. I say might because it's been too many years since I hung up my Unix/Linux hat and I've forgotten most of the good stuff. Sorry I can't be more helpful. The above translates into English as: merge file1 and file2 into file3, sort file3 and then find those lines that are NOT duplicated (i.e. unique) and store them in uniqueRecords.txt. uniq has an option to return the opposite, i.e. those lines that ARE duplicates. so then you'd do: cat file3 | sort | uniq -d duplicateRecords.txt Now eliminate the duplicates from the duplicateRecords.txt file cat duplicateRecords.txt | uniq almostDone.txt Finally, merge the first set of unique records with the unique duplicates. cat uniqueRecords.txt almostDone.txt done.csv Here's the same thing without all the comments: cat file1 file2 file3 | sort | uniq uniqueRecords.txt cat file3 | sort | uniq -d duplicateRecords.txt cat duplicateRecords.txt | uniq almostDone.txt cat uniqueRecords.txt almostDone.txt done.csv Not having a Unix/Linux system handy I can't test it, but you should see results like this if I got it right: Assuming file1 is composed of: x,y,z a,b,c d,e,f and file2: a,b,c g,h,i j,k,l Results for each line above should be as follows: 1. d,e,f g,h,i j,k,l x,y,z 2. a,b,c a,b,c 3. a,b,c 4. a,b,c d,e,f g,h,i j,k,l x,y,z Another possible tool to look into is AWK. If you know it, it could be even simpler than the above 4 line script. diff and/or one of it's variants might be another potential tool to get the job done. On 7/28/2012 12:42 PM, Lynne Stevens wrote: omega The Omega sector America's Last Line of Defense *Oh yeah both files are CVS and are the way Thomas downloads them but being he is not to computer literate and most likely does not know he can save it in different ways . . No telling . . I ask one time and he said that is how it gets downloaded * On 07/28/2012 09:39 AM, Alexander Thurgood wrote: Le 28/07/12 14:17, Lynne Stevens a écrit : Hi Lynne, *How do I check for duplicates in a data base using another data base ? -- For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/ Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette List archive: http://listarchives.libreoffice.org/global/users/ All messages sent to this list will be publicly archived and cannot be deleted
Re: [libreoffice-users] Re: I have a question
omega The Omega sector America's Last Line of Defense *Hey that sounds good to me I will have Thomas bring over a set of the files and I will use it . . will make Thomas jump up and down and happy * On 07/28/2012 01:28 PM, Andreas Säger wrote: Am 28.07.2012 22:03, Lynne Stevens wrote: omega The Omega sector America's Last Line of Defense *Ok Thomas downloads 2 Data bases ( CVS format ) he wants to remove Duplicates from Data Base B using Data base A and THEN Data Base A gets deleted when done * OK, that is $ cat fileA.csv fileB.csv | sort | uniq fileC.csv -- For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/ Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette List archive: http://listarchives.libreoffice.org/global/users/ All messages sent to this list will be publicly archived and cannot be deleted
Re: [libreoffice-users] Re: I have a question
2012/7/28 Andreas Säger ville...@t-online.de: Am 28.07.2012 22:03, Lynne Stevens wrote: omega The Omega sector America's Last Line of Defense *Ok Thomas downloads 2 Data bases ( CVS format ) he wants to remove Duplicates from Data Base B using Data base A and THEN Data Base A gets deleted when done * OK, that is $ cat fileA.csv fileB.csv | sort | uniq fileC.csv And then: $ rm fileA* Kind regards Johnny Rosenberg ジョニー・ローゼンバーグ -- For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/ Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette List archive: http://listarchives.libreoffice.org/global/users/ All messages sent to this list will be publicly archived and cannot be deleted
Re: [libreoffice-users] Re: I have a question
On 07/28/2012 03:42 PM, Lynne Stevens wrote: omega The Omega sector America's Last Line of Defense *Oh yeah both files are CVS and are the way Thomas downloads them but being he is not to computer literate and most likely does not know he can save it in different ways . . No telling . . I ask one time and he said that is how it gets downloaded Lynne, What I think you are doing is loading two tables from csv files into the *same* database. With databases one finds there is a fairly precise terminology used. Once you have the data loaded then on can query the data. The tables do not need to have the same structure only be related in some logical manner. If I understand the project you download a CSV file with addresses that has many duplicates of data in a table in the database. The next step is find only those addresses that are not duplicates so they can be appended to the existing table. If you upload the new data into a table (Temp) you can then write a query to find new/updated entries in the existing table (Addresses). Roughly the query would look something like this SELECT relevant columns from Temp, each column listed in the following format t.columnname FROM Temp AS t, Addresses AS a (using an alias you do not need to type the full table name) WHERE t.columnname != a.columnname - more than one criteria can be used such as customer name, address, city, phone number, etc to identify the duplicates. For the WHERE clause I would look for some unique id used in the data sets, such as a customer number. * On 07/28/2012 09:39 AM, Alexander Thurgood wrote: Le 28/07/12 14:17, Lynne Stevens a écrit : Hi Lynne, *How do I check for duplicates in a data base using another data base ? By another data base, do you mean : (a) another ODB file ? (b) another type of database manager, e.g. Access vs Calc, or MySQL vs Access, or even Access vs Access ? (c) another table in the same ODB file or in the same database schema managed by a unique database manager ? (d) any or all of the above combined ? As you can see, the possibilities belie what would appear to be a seemingly simple question. Andreas has answered (c) for you, i.e. where both tables are in the same database schema. There is no simple way to compare the contents of 2 HSQLDB/ODB files, other than extracting the contents from one and importing it into the other as a temporary or new table and then comparing the table results from within the single ODB file using a JOIN query. As for (b) LO Base does not allow you to adress two separate database schema as far as I know from within a single query. I heard tell once that it might be possible with Access tables being read by LO if an ODBC connection was being used, and this only would work on Windows. Not having tried it, I can't say whether it is true or not. If what you are aiming for is indeed cross-schema queries in order to be able to compare, this has been a requested (and never implemented) feature in OpenOffice.org for many years. Alex -- Jay Lozier jsloz...@gmail.com -- For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/ Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette List archive: http://listarchives.libreoffice.org/global/users/ All messages sent to this list will be publicly archived and cannot be deleted
Re: [libreoffice-users] Re: I have a question
omega The Omega sector America's Last Line of Defense *Since it has the recorder office number ( column A ) on the line of all the John Kings and if the number is the same it will be deleted but if not and there are two different John's and it has a different recorders office number it will stay which is good . . and the column R has a letter which says it is not the same . . there are 2 columns that are in this which says which one goes and which stays . . the recorders office number Column A and the Status Column R are the ones which do all the work rest are the same stuff . . * On 07/28/2012 03:24 PM, Andrew Brager wrote: Keep in mind that you do in fact lose the duplicates completely. I would think you want to keep one of the duplicates. So for example if the merged files give you 4 records total: John King John King Peter Rabbit Bugs Bunny You end up with 2 records - Peter Rabbit and Bugs Bunny - losing John King completely. The solution I presented returns all 3 unique records, losing just one of the repeated names instead of both, which to me makes more sense. Obviously you know your application better than I do, just want to make sure you're not surprised when 25% (or whatever percentage) of your database goes missing. On 7/28/2012 2:22 PM, Lynne Stevens wrote: *Hey that sounds good to me I will have Thomas bring over a set of the files and I will use it . . will make Thomas jump up and down and happy Defense *Ok Thomas downloads 2 Data bases ( CVS format ) he wants to remove Duplicates from Data Base B using Data base A and THEN Data Base A gets deleted when done * OK, that is $ cat fileA.csv fileB.csv | sort | uniq fileC.csv -- For unsubscribe instructions e-mail to: users+h...@global.libreoffice.org Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/ Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette List archive: http://listarchives.libreoffice.org/global/users/ All messages sent to this list will be publicly archived and cannot be deleted