Re: Fwd: some transformations on file
Hi Paul,

Yes, that is exactly the approach I am trying to follow. I have my algorithm, but for the moment it is a bit hard for me to write good code (I hope to have more time to learn Perl).

samuel

2013/1/21 Paul Hoffman nkui...@nkuitse.com:
> It sounds as though what you *really* want in the end is a *single*
> file of three MARC records, like this:
>
>     B1 + I1 + I3
>     B2 + I4 + I5
>     B3 + I2 + I6 + I7
>
> Is that right? Here's a rough start in Perl: [...]
Re: Fwd: some transformations on file
hello,

Just 2 notes about your attached content:

* Please don't do that on a mailing list: it's unsolicited content. Provide download URLs instead.
* Those are not MARC files, so the examples given below won't work until you have translated them to ISO 2709.

On Sun, Jan 20, 2013 at 03:54:18PM +0100, samuel desseaux wrote:
> Hi,
>
> I work on files for our library and I need some help. I have one file
> with all the biblio records and one with the items. A biblio record can
> have one or more items.
>
> First operation: I want to compare the two files, using field 001 as
> the identifier. I want the results in two separate files:
> 1st: all the items whose 001 field matches a biblio record
> 2nd: all the items whose 001 field does not match any biblio record

Not tested, but here is a good base:

    use Modern::Perl;
    use autodie;
    use MARC::MIR;

    my %biblio;   # 001 values seen in the biblio file
    my %report;   # output filehandles, keyed by 'do' / 'dont'
    open $report{$_}, '>', "$_.matches.txt" for qw< do dont >;

    # remember the id of every biblio record
    marawk { $biblio{ (record_id) } = 1 } 'biblio.mrc';

    # write each item id to do.matches.txt or dont.matches.txt
    marawk {
        my $id = record_id;
        my $as = $biblio{$id} ? 'do' : 'dont';
        say { $report{$as} } $id;
    } 'items.mrc';

> Second operation: In my item file, all the items of the same biblio
> record have the same 001 field, but they are all separated. I'd like to
> join all the items under a single 001 field.

a) Be careful: it will load the whole file in memory.
b) Not tested :)

    use Modern::Perl;
    use autodie;
    use MARC::MIR;

    # group the item records by their 001
    my %items_for;
    marawk { push @{ $items_for{ (record_id) } }, $_ } 'items.mrc';

    # write them back out, one group after another
    open my $fh, '>', 'sorted.items.mrc';
    for my $items ( values %items_for ) {
        print {$fh} to_iso2709 for @$items;
    }

> After, with the new file, I want to merge with the biblio records, and
> if I find 2 identical 001 fields, I attach the items to the biblio
> record.

I don't get it. You want to merge item records and biblio records?

> Third operation: how can I correct some badly encoded data? It's due to
> the old database, which doesn't respect UTF-8.

I see no problem in the provided content.

regards
marc
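The first operation discussed above (splitting items into those whose 001 matches a biblio record and those whose 001 does not) can be sketched without any MARC module. This is only an illustration under stated assumptions: plain strings stand in for parsed records, `partition_items` is a hypothetical helper name, and the id 9999 is made up to show an unmatched item.

```perl
use strict;
use warnings;

# Hypothetical helper: split item 001 values into those that match a
# biblio 001 and those that do not. Plain strings stand in for records.
sub partition_items {
    my ($biblio_ids, $item_ids) = @_;

    # Lookup table of every 001 present in the biblio file.
    my %is_biblio = map { $_ => 1 } @$biblio_ids;

    my (@matched, @unmatched);
    for my $id (@$item_ids) {
        if ($is_biblio{$id}) { push @matched,   $id }
        else                 { push @unmatched, $id }
    }
    return (\@matched, \@unmatched);
}

# Illustrative 001 values only; 9999 is an invented orphan item.
my @biblio_ids = qw(1029 3884 1650);
my @item_ids   = qw(1029 1650 9999 3884);

my ($matched, $unmatched) = partition_items(\@biblio_ids, \@item_ids);
print "matched:   @$matched\n";
print "unmatched: @$unmatched\n";
```

With real data, the two id lists would come from reading `biblio.mrc` and `items.mrc`, and the two result lists would go to the two report files.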
Re: Fwd: some transformations on file
Hello,

Can you send me a link to the MARC files again?

Regards,
marc

-- 
Marc Chantreux
Université de Strasbourg, Direction Informatique
14 Rue René Descartes, 67084 STRASBOURG CEDEX
☎: 03.68.85.57.40
http://unistra.fr
Don't believe everything you read on the Internet -- Abraham Lincoln
Re: Fwd: some transformations on file
Oops! Sorry about that: wrong recipient.

On Sun, Jan 20, 2013 at 06:14:20PM +0100, Marc Chantreux wrote:
> Hello, can you send me a link to the MARC files again?
Re: Fwd: some transformations on file
* If it's a better solution, I will put my files (converted to ISO 2709) on Dropbox.
* The goal is to properly join items with biblio records. As we have two separate files, it's a bit hard. With MarcEdit, merging these two files is limited: MarcEdit doesn't understand that one biblio record can have more than one item :-). I won't say any more about my library and its exotic old ILS, which I have left for Koha.

2013/1/20 Marc Chantreux m...@unistra.fr:
> Just 2 notes about your attached content:
> * please don't do that on a mailing list: it's unsolicited content.
>   Provide download URLs instead.
> * those are not MARC files, so the examples given below won't work
>   until you have translated them to ISO 2709.
> [...]
> I don't get it. You want to merge item records and biblio records?
Re: Fwd: some transformations on file
On Sun, Jan 20, 2013 at 06:43:38PM +0100, samuel desseaux wrote:
> * the goal is to properly join items with biblio records.

Let's assume that you have these two files:

(B) Three MARC bibliographic records

    1. 001 = 1029
    2. 001 = 3884
    3. 001 = 1650
    (etc.)

(I) Seven MARC item records

    1. 001 = 1029
    2. 001 = 1650
    3. 001 = 1029
    4. 001 = 3884
    5. 001 = 3884
    6. 001 = 1650
    7. 001 = 1650

Do you want to produce a *new* file of three records, like this?

    1. I1 + I3
    2. I4 + I5
    3. I2 + I6 + I7

Is this really what you want to have in the end?

> As we have two separate files, it's a bit hard. With MarcEdit, merging
> these two files is limited: MarcEdit doesn't understand that one biblio
> record can have more than one item :-). I won't say any more about my
> library and its exotic old ILS, which I have left for Koha.

It sounds as though what you *really* want in the end is a *single* file of three MARC records, like this:

    B1 + I1 + I3
    B2 + I4 + I5
    B3 + I2 + I6 + I7

Is that right? Here's a rough start in Perl:

    use MARC::File::USMARC;

    my ($file, %records);

    # Index every bibliographic record by its system number (001).
    $file = MARC::File::USMARC->in($bib_records_file);
    while (my $bib_marc = read_next_record_from($file)) {
        my $sysnum = sysnum($bib_marc);
        $records{$sysnum} = [ $bib_marc ];
    }
    $file->close;

    # Append each item record to the list for its bibliographic record.
    # ($item_records_file: path to the item file.)
    $file = MARC::File::USMARC->in($item_records_file);
    while (my $item_marc = read_next_record_from($file)) {
        my $sysnum = sysnum($item_marc);
        push @{ $records{$sysnum} }, $item_marc;
    }
    $file->close;

    # Print each bibliographic record followed by its items.
    print @$_ for values %records;

Let us know if you need help writing read_next_record_from() or sysnum().

Paul.

-- 
Paul Hoffman nkui...@nkuitse.com
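The grouping in the example above can be checked with a small self-contained sketch that uses only core Perl. It is an illustration, not the code from the thread: each record is reduced to a hypothetical [label, 001] pair, `group_by_001` is an invented helper name, and skipping items that have no biblio record is an assumption the thread does not settle.

```perl
use strict;
use warnings;

# Hypothetical helper: attach item labels to their biblio label by 001.
# Each record is a [label, 001] pair standing in for a parsed MARC record.
sub group_by_001 {
    my ($biblios, $items) = @_;
    my (%group, @order);

    # Start one group per biblio record, remembering the file order.
    for my $bib (@$biblios) {
        my ($label, $id) = @$bib;
        $group{$id} = [$label];
        push @order, $id;
    }

    # Append each item to the group that shares its 001.
    for my $item (@$items) {
        my ($label, $id) = @$item;
        next unless exists $group{$id};   # assumption: skip orphan items
        push @{ $group{$id} }, $label;
    }

    return [ map { join ' + ', @{ $group{$_} } } @order ];
}

# The three biblio records and seven item records from the example.
my @biblios = ([B1 => 1029], [B2 => 3884], [B3 => 1650]);
my @items   = ([I1 => 1029], [I2 => 1650], [I3 => 1029], [I4 => 3884],
               [I5 => 3884], [I6 => 1650], [I7 => 1650]);

print "$_\n" for @{ group_by_001(\@biblios, \@items) };
# prints:
#   B1 + I1 + I3
#   B2 + I4 + I5
#   B3 + I2 + I6 + I7
```

Keeping a separate `@order` array preserves the order of the biblio file, which iterating over `values %records` does not guarantee.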