On Mon, 2002-03-04 at 10:32, Brett W. McCoy wrote:
> On Mon, 4 Mar 2002, Dave Adams wrote:
>
> > I have a problem which I would like to use perl to resolve, however I'm not
> > sure if it is possible to do.
> >
> > I need to scan a file and check some conditions, first if field 9 is
> > duplicated on 1 or more rows, then I need to check field 10 to see which is
> > the greater value and then only print the whole row where field 10 is the
> > greater, if field 9 is not a duplicate then print the whole row.
> > An example of the data is below.
> >
> > 28525|U2|4CY0|50|6775.15|2002-02-07|10461|321.43|102040724|102040773|
> > 28526|U2|4CY0|25|3571.78|2002-02-07|6107|167.74|102040774|102040798|
> > 28527|U2|4CY0|50|6930.3|2002-02-07|11376|324.12|102040774|102040823|
> > 28528|U2|4CY0|25|4640.28|2002-02-07|4800|217.43|102040824|102040848|
> > 28529|U2|4CY0|50|8432.05|2002-02-07|9023|392.03|102040824|102040873|
>
> Of course this can be done with Perl, although the algorithm will take
> some thinking through. You should be able to read these rows into an
> array (using split, et al).
>
> However, this will be much easier if you can get this into a real
> database system so you can use DBI. In fact, a text file like this can
> even be used with DBI -- take a look at DBD::CSV. It lets you create and
> manipulate tables via SQL, which would make your problem much simpler
> (although DBD::CSV may have problems enforcing unique constraints).
> Otherwise, if you have a good bit of data to deal with, take a look at a
> database system like PostgreSQL, which can easily handle the logic you are
> trying to implement.
>
> -- Brett
> http://www.chapelperilous.net/
> ------------------------------------------------------------------------
> To do nothing is to be nothing.

As a DBA I disagree. Text processing is often faster and easier to write.
Relational databases are very good at some things: referential integrity,
interactivity, data mining, etc. Relational databases absolutely suck at
this sort of self-referential question, where one value in the table has to
be checked against other rows of the same table. The first obvious solution
using an RDB (build a cursor over the data and look at the ninth and tenth
fields) is no better than the Perl solution; in fact it is much slower,
since there are multiple SQL calls that must execute.

Besides, based on the look of the file (pipe delimited, with one pipe per
column) I think the data was already in a DB.

--
Today is Pungenday the 63rd day of Chaos in the YOLD 3168
Grudnuk demand sustenance!
Missile Address: 33:48:3.521N 84:23:34.786W
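
For what it's worth, the text-processing route can be sketched in a few lines of Perl. This is only a rough sketch under some assumptions: fields are counted from 1 (so fields 9 and 10 are indexes 8 and 9 after split), field 10 is compared numerically, the whole file fits in memory, and output keeps the order in which each field 9 value was first seen. The name keep_max_rows and the tie-breaking rule (first row wins) are my own choices, not anything from the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# For each distinct value of field 9, keep only the row whose field 10 is
# numerically greatest. A row whose field 9 is unique is kept as-is.
# keep_max_rows is a hypothetical helper name used for illustration.
sub keep_max_rows {
    my @lines = @_;
    my (%best, @order);
    for my $line (@lines) {
        chomp $line;
        my @f = split /\|/, $line;      # the trailing empty field is dropped
        next if @f < 10;                # skip blank or short rows
        my ($key, $val) = @f[8, 9];     # fields 9 and 10 (zero-based 8 and 9)
        if (!exists $best{$key}) {
            push @order, $key;          # remember first-seen order of keys
            $best{$key} = [$val, $line];
        }
        elsif ($val > $best{$key}[0]) { # on a tie the earlier row is kept
            $best{$key} = [$val, $line];
        }
    }
    return map { $best{$_}[1] } @order;
}

# The sample rows from the original question.
my @sample = (
    '28525|U2|4CY0|50|6775.15|2002-02-07|10461|321.43|102040724|102040773|',
    '28526|U2|4CY0|25|3571.78|2002-02-07|6107|167.74|102040774|102040798|',
    '28527|U2|4CY0|50|6930.3|2002-02-07|11376|324.12|102040774|102040823|',
    '28528|U2|4CY0|25|4640.28|2002-02-07|4800|217.43|102040824|102040848|',
    '28529|U2|4CY0|50|8432.05|2002-02-07|9023|392.03|102040824|102040873|',
);

my @kept = keep_max_rows(@sample);
print "$_\n" for @kept;
```

On the sample data this prints rows 28525, 28527 and 28529. To run it against a real file, replacing @sample with the lines read from <> should be enough. It is a single pass with one hash lookup per row, which is the sort of thing a cursor loop has a hard time beating.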