I'm trying to process a DOS text file (with CRLF line terminators) and
translate it from one database's export format into another database's
import format. I've pasted in my program and a short example file of data
at the end of this message. If the program is named 'medline2popline.pl'
and the data file is named 't.txt', the program is run with
'./medline2popline.pl t.txt'.

In the data file, fields start with 2-4 uppercase letters or spaces,
followed by a dash and a space, followed by the data. They end with
CRLF. Records are separated by two consecutive CRLF combinations.
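
The layout described above can be sketched as a small Perl routine. This
is only an illustration, not part of medline2popline.pl: the name
split_fields is hypothetical, and the rule that continuation lines begin
with six spaces is an assumption read off the sample data at the end of
this message.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of medline2popline.pl): split one record,
# already stripped of its trailing blank line, into [tag, value] pairs.
sub split_fields {
    my ($record) = @_;
    my @fields;
    for my $line (split /\015\012/, $record) {
        if ($line =~ /^([A-Z]{2,4}) *- (.*)$/) {
            # A new field: 2-4 uppercase letters, space padding, "- ", data.
            push @fields, [$1, $2];
        }
        elsif (@fields && $line =~ /^ {6}(.*)$/) {
            # Assumed continuation line: six spaces, then more data that
            # belongs to the most recent field.
            $fields[-1][1] .= " $1";
        }
    }
    return @fields;
}

# Shortened, made-up record in the same layout as the sample data.
my $record = "AU  - King J\015\012"
           . "AD  - Emory University,\015\012"
           . "      Atlanta\015\012"
           . "PMID- 1";
printf "%s => %s\n", @$_ for split_fields($record);
```

Splitting on CRLF first keeps the carriage returns out of the captured
values, so each pattern only has to deal with a single line.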

I think my problem is caused by the DOS line terminators and the way
I'm handling them in my program. The trouble is with fields that span
more than one line, like this:
AD  - Department of Family and Community Medicine, College of Medicine,
King Faisal^M$
      University, Dammam, Saudi Arabia. [EMAIL PROTECTED]

(This should be just two lines; my email program is wrapping them.) I'm
trying to capture everything from the first 'AD  - ' to the next set of
four characters that are either upper-case letters or blanks, followed
by a dash and a blank. I tried to use this:
   my($ad) = /AD  - (.*?)\015\012([A-Z]|\s){4}-\s/;

This regex only matches the one address in my sample data that consists
of just one line. It fails to match anything for the multi-line
addresses. Any suggestions on how I could capture both lines (or however
many lines are used) in $ad?
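
One factor that may be relevant: in Perl, '.' does not match a newline
(\012) unless the /s modifier is given, so (.*?) cannot extend past the
end of the first line. The following is a minimal sketch of one possible
approach, not tested against the full data file. It assumes continuation
lines are indented by six spaces and never resemble a field tag; the
record string is a shortened illustration.

```perl
use strict;
use warnings;

# Illustrative record fragment; the address text is shortened sample data.
my $record = "PG  - 881-92\015\012"
           . "AD  - Department of Family and Community Medicine,\015\012"
           . "      University, Dammam, Saudi Arabia.\015\012"
           . "FAU - Rasheed, P\015\012";

# /s lets '.' match the newline inside a multi-line field, and /m lets '^'
# match at the start of any line. The lookahead stops at the next tag
# (four uppercase letters or spaces, then "- ") or at the end of the
# record, without consuming it.
my ($ad) = $record =~ /^AD  - (.*?)(?=\015\012[A-Z ]{4}- |(?:\015\012)?\z)/ms;

# Fold each CRLF plus the assumed six-space indent into a single space.
$ad =~ s/\015\012 {6}/ /g if defined $ad;

print "AuthorAddress $ad\n" if defined $ad;
```

A continuation line does begin with four blanks, but they are followed by
more blanks rather than a dash, so the lookahead skips it and only stops
at a real tag such as "FAU - ".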

Thanks for your advice and help.

-Kevin

Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland  21202
410-659-6139 

============================
My complete program so far:
#! /usr/bin/perl -w
#
#medline2popline.pl written 7-Jan-2008 by Kevin Zembower
#
#   This program converts medline records into POPLINE InMagic
#   importable records.
#
#   It reads the medline records from STDIN and outputs POPLINE import
#   records to STDOUT


use strict;

$/="\015\012\015\012";  # Read a whole record (separated by a blank line) at a time.
$\="\015\012";  # Output line termination is CRLF (for DOS)

while (<>) {
   chomp;
   #print "Record: $_";
   my($ad) = /AD  - (.*?)\015\012([A-Z]|\s){4}-\s/;
   my($au) = /AU  - (.*?)\015\012/;
   my($dp) = /DP  - (.*?)\015\012/;
   my($ip) = /IP  - (.*?)\015\012/;
   my($pg) = /PG  - (.*?)\015\012/;
   my($pl) = /PL  - (.*?)\015\012/;
   my($tt) = /TT  - (.*?)\015\012/;
   my($vi) = /VI  - (.*?)\015\012/;

   print "AuthorAddress $ad"    if ($ad);
   print "Author $au"           if ($au);
   print "DateofPub $dp"        if ($dp);
   print "Issue $ip"            if ($ip);
   print "Pagination $pg"       if ($pg);
   print "JournalCountry $pl"   if ($pl);
   print "TT $tt"               if ($tt);
   print "Volume $vi"           if ($vi);
   print "";    #Blank line between records
} # while there are more records to input
==============================================
short data file (in cat -vet output format):
[EMAIL PROTECTED]:~/medline2popline$ cat -vet t.txt 
PMID- 17991957^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA  - 20071109^M$
DCOM- 20071126^M$
PUBM- Print^M$
IS  - 1468-5833 (Electronic)^M$
VI  - 335^M$
IP  - 7627^M$
DP  - 2007 Nov 10^M$
TI  - Faulty government condoms threaten South Africa's AIDS programme.^M$
PG  - 957^M$
FAU - Moszynski, Peter^M$
AU  - Moszynski P^M$
LA  - eng^M$
PT  - News^M$
PL  - England^M$
TA  - BMJ^M$
JT  - BMJ (Clinical research ed.)^M$
JID - 8900488^M$
SB  - AIM^M$
SB  - IM^M$
MH  - Acquired Immunodeficiency Syndrome/*prevention & control^M$
MH  - Condoms/*standards^M$
MH  - Humans^M$
MH  - South Africa^M$
EDAT- 2007/11/10 09:00^M$
MHDA- 2007/12/06 09:00^M$
12/7/2007 1:36PM335/7627/957 [pii]^M$
AID - 10.1136/bmj.39388.651308.DB [doi]^M$
PST - ppublish^M$
SO  - BMJ. 2007 Nov 10;335(7627):957.^M$
^M$
PMID- 17983999^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA  - 20071106^M$
DCOM- 20071129^M$
PUBM- Print^M$
IS  - 1542-2011 (Electronic)^M$
VI  - 52^M$
IP  - 6^M$
DP  - 2007 Nov-Dec^M$
TI  - Contraception and lactation.^M$
PG  - 614-20^M$
AB  - The benefits of breastfeeding for both the infant and the mother are undisputed. ^M$
      Longer intervals between births decrease fetal/infant and maternal complications.^M$
      Lactation is an effective contraceptive for the first 6 months postpartum only if^M$
      women breastfeed exclusively and at regular intervals, including nighttime.^M$
      Because a high percentage of women in the United States supplement breastfeeding,^M$
      it is important for these women to choose a method of contraception to prevent^M$
      unintended pregnancies. Both the method of contraception and the timing of the^M$
      initiation of contraceptives are important decisions that a clinician must help^M$
      the breastfeeding woman make. Ideally, the chosen method of contraception should ^M$
      not interfere with lactation. This article reviews the research on the effect of ^M$
      contraceptives, including hormonal contraceptives, on lactation.^M$
AD  - Emory University School of Nursing, Atlanta, GA 30322, USA. [EMAIL PROTECTED]
FAU - King, Joyce^M$
AU  - King J^M$
LA  - eng^M$
LA  - fre^M$
PT  - Journal Article^M$
PT  - Review^M$
PL  - United States^M$
TA  - J Midwifery Womens Health^M$
JT  - Journal of midwifery & women's health^M$
JID - 100909407^M$
RN  - 0 (Contraceptive Agents, Female)^M$
SB  - IM^M$
SB  - N^M$
MH  - Contraception/*methods^M$
MH  - Contraception Behavior^M$
MH  - *Contraceptive Agents, Female^M$
MH  - *Contraceptive Devices, Female^M$
MH  - Evidence-Based Medicine^M$
MH  - Female^M$
MH  - Health Knowledge, Attitudes, Practice^M$
MH  - Humans^M$
MH  - Infant Welfare^M$
MH  - Infant, Newborn^M$
MH  - *Lactation^M$
MH  - Maternal Welfare^M$
MH  - Mothers/education^M$
MH  - *Patient Education as Topic^M$
MH  - *Postpartum Period^M$
RF  - 37^M$
EDAT- 2007/11/07 09:00^M$
MHDA- 2007/12/06 09:00^M$
AID - S1526-9523(07)00355-8 [pii]^M$
AID - 10.1016/j.jmwh.2007.08.012 [doi]^M$
PST - ppublish^M$
SO  - J Midwifery Womens Health. 2007 Nov-Dec;52(6):614-20.^M$
^M$
PMID- 17955772^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA  - 20071024^M$
DCOM- 20071120^M$
PUBM- Print^M$
IS  - 1020-3397 (Print)^M$
VI  - 13^M$
IP  - 4^M$
DP  - 2007 Jul-Aug^M$
TI  - Birth interval: perceptions and practices among urban-based Saudi Arabian women.^M$
PG  - 881-92^M$
AB  - To determine perceptions towards birth spacing, actual birth interval and^M$
      associated sociodemographic factors, we carried out a cross-sectional study on^M$
      436 mothers aged 15-50 years in Al-Khobar. All had had > or = 2 children within^M$
      the previous 10 years. Only 5.2% preferred a birth interval of < 2 years, 28.2%^M$
      preferred a 2 -< 3-year interval, while the rest favoured > or = 3 years.^M$
      Education and employment status were predictors of birth spacing preference.^M$
      About half were not aware of the physical benefits associated with longer birth^M$
      interval. Only 26.3% had mean birth interval < 2 years. Age and employment status^M$
      were significant positive predictors of longer birth interval. Oral contraception^M$
      was the most popular method adopted for child spacing.^M$
AD  - Department of Family and Community Medicine, College of Medicine, King Faisal^M$
      University, Dammam, Saudi Arabia. [EMAIL PROTECTED]
FAU - Rasheed, P^M$
AU  - Rasheed P^M$
FAU - Al-Dabal, B K^M$
AU  - Al-Dabal BK^M$
LA  - eng^M$
PT  - Journal Article^M$
PL  - Egypt^M$
TA  - East Mediterr Health J^M$
JT  - Eastern Mediterranean health journal = La revue de sante de la Mediterranee^M$
      orientale = al-Majallah al-sihhiyah li-sharq al-mutawassit^M$
JID - 9608387^M$
SB  - IM^M$
MH  - Adolescent^M$
MH  - Adult^M$
MH  - *Attitude to Health^M$
MH  - *Birth Intervals/psychology/statistics & numerical data^M$
MH  - Choice Behavior^M$
MH  - Contraception/methods/psychology/statistics & numerical data^M$
MH  - Contraception Behavior/psychology/statistics & numerical data^M$
MH  - Cross-Sectional Studies^M$
MH  - Educational Status^M$
MH  - Family Characteristics^M$
MH  - Female^M$
MH  - *Health Knowledge, Attitudes, Practice^M$
MH  - Humans^M$
MH  - Intention^M$
MH  - Linear Models^M$
MH  - Maternal Age^M$
MH  - *Mothers/education/psychology/statistics & numerical data^M$
MH  - Multivariate Analysis^M$
MH  - Occupations/statistics & numerical data^M$
MH  - Questionnaires^M$
MH  - Saudi Arabia^M$
MH  - Socioeconomic Factors^M$
MH  - Urban Population^M$
EDAT- 2007/10/25 09:00^M$
MHDA- 2007/12/06 09:00^M$
PST - ppublish^M$
SO  - East Mediterr Health J. 2007 Jul-Aug;13(4):881-92.^M$
^M$
PMID- 17955618^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA  - 20071005^M$
DCOM- 20071113^M$
PUBM- Print^M$
IS  - 1181-7186 (Print)^M$
VI  - 19^M$
IP  - 5^M$
DP  - 2007 Jul-Aug^M$
TI  - Infections. Sex and hepatitis C infection in Germany.^M$
PG  - 9-10^M$
LA  - eng^M$
PT  - Newspaper Article^M$
PL  - Canada^M$
TA  - TreatmentUpdate^M$
JT  - TreatmentUpdate^M$
JID - 100891076^M$
SB  - X^M$
MH  - Adult^M$
MH  - Condoms^M$
MH  - Germany/epidemiology^M$
MH  - Hepatitis C/*epidemiology/prevention & control/transmission^M$
MH  - Humans^M$
MH  - Male^M$
MH  - Risk Factors^M$
MH  - *Sexual Behavior^M$
MH  - Substance-Related Disorders^M$
MH  - Surgical Procedures, Operative/adverse effects^M$
EDAT- 2007/10/25 09:00^M$
MHDA- 2007/11/14 09:00^M$
PST - ppublish^M$
SO  - TreatmentUpdate. 2007 Jul-Aug;19(5):9-10.^M$
^M$
PMID- 17952213^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA  - 20071022^M$
DCOM- 20071106^M$
LR  - 20071115^M$
PUBM- Print^M$
IS  - 0256-9574 (Print)^M$
VI  - 97^M$
IP  - 8^M$
DP  - 2007 Aug^M$
TI  - Emergency contraception--lack of awareness among women presenting for termination^M$
      of pregnancy.^M$
PG  - 584-5^M$
FAU - Moodley, Jennifer^M$
AU  - Moodley J^M$
FAU - Morroni, Chelsea^M$
AU  - Morroni C^M$
LA  - eng^M$
PT  - Letter^M$
PL  - South Africa^M$
TA  - S Afr Med J^M$
JT  - South African medical journal = Suid-Afrikaanse tydskrif vir geneeskunde^M$
JID - 0404520^M$
RN  - 0 (Contraceptives, Postcoital)^M$
SB  - IM^M$
MH  - *Abortion, Legal^M$
MH  - Adolescent^M$
MH  - Adult^M$
MH  - *Awareness^M$
MH  - Contraception/*methods^M$
MH  - Contraceptives, Postcoital/*therapeutic use^M$
MH  - *Emergencies^M$
MH  - Female^M$
MH  - *Health Knowledge, Attitudes, Practice^M$
MH  - Health Surveys^M$
MH  - Humans^M$
MH  - *Patient Education as Topic^M$
MH  - Pregnancy^M$
EDAT- 2007/10/24 09:00^M$
MHDA- 2007/11/07 09:00^M$
PST - ppublish^M$
SO  - S Afr Med J. 2007 Aug;97(8):584-5.^M$
^M$
PMID- 17949389^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA  - 20071022^M$
DCOM- 20071108^M$
PUBM- Print^M$
IS  - 1471-0528 (Electronic)^M$
VI  - 114^M$
IP  - 11^M$
DP  - 2007 Nov^M$
TI  - The feasibility, success and patient satisfaction associated with outpatients^M$
      hysteroscopic sterilisation.^M$
PG  - 1449; author reply 1449-50^M$
FAU - Qureshi, N S^M$
AU  - Qureshi NS^M$
LA  - eng^M$
PT  - Comment^M$
PT  - Letter^M$
PL  - England^M$
TA  - BJOG^M$
JT  - BJOG : an international journal of obstetrics and gynaecology^M$
JID - 100935741^M$
SB  - AIM^M$
SB  - IM^M$
CON - BJOG. 2007 Jun;114(6):676-83. PMID: 17516957^M$
MH  - Ambulatory Surgical Procedures/*methods^M$
MH  - Catheter Ablation/methods^M$
MH  - Feasibility Studies^M$
MH  - Female^M$
MH  - Humans^M$
MH  - Magnetic Resonance Imaging^M$
MH  - Microwaves/therapeutic use^M$
MH  - *Patient Satisfaction^M$
MH  - Sterilization, Reproductive^M$
EDAT- 2007/10/24 09:00^M$
MHDA- 2007/11/09 09:00^M$
AID - BJO1494 [pii]^M$
AID - 10.1111/j.1471-0528.2007.01494.x [doi]^M$
PST - ppublish^M$
SO  - BJOG. 2007 Nov;114(11):1449; author reply 1449-50.^M$
^M$
PMID- 17948764^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA  - 20071022^M$
DCOM- 20071113^M$
PUBM- Print^M$
IS  - 0895-3988 (Print)^M$
VI  - 20^M$
IP  - 4^M$
DP  - 2007 Aug^M$
TI  - Power relation and condom use in commercial sex behaviors.^M$
PG  - 302-6^M$
AB  - OBJECTIVE: To explore whether condom use is influenced by power relation in^M$
      commercial sex behaviors. METHODS: Variables were designed to measure the power^M$
      relation in commercial sex behaviors based on the theory of gender and power^M$
      relation and data were collected from male sexually transmitted diseases (STD)^M$
      patients and female commercial sex workers (FSWs) working at recreation centers^M$
      or being detained in a women education center to identify the relationship^M$
      between condom use and power relation in male and female respondents using^M$
      bivariate and multiple regression analysis. RESULTS: A significant relationship^M$
      was identified between power relation and female condom use, the higher the score^M$
      of power relations, the higher frequency the condom use, but no similar result^M$
      was found in males. Females got a higher score of power relation than males.^M$
      CONCLUSIONS: Power relation is one of the factors that influence condom use,^M$
      which should be considered when relevant theories are used to study the rate of^M$
      condom use. It is worthwhile exploring the relationship between safe sex and^M$
      power relation in spouses and regular sex partners when interventions are adopted^M$
      to prevent HIV/AIDS spreading from high risk groups to the general population.^M$
AD  - Institute of Viral Disease Prevention and Control, Chinese Center for Disease^M$
      Control and Prevention, Beijing 100050, China.^M$
FAU - Wang, Ying^M$
AU  - Wang Y^M$
FAU - Li, Bing^M$
AU  - Li B^M$
FAU - Song, Dong-Mei^M$
AU  - Song DM^M$
FAU - Ding, Guang-Yan^M$
AU  - Ding GY^M$
FAU - Cathy, Emric^M$
AU  - Cathy E^M$
LA  - eng^M$
PT  - Journal Article^M$
PL  - United States^M$
TA  - Biomed Environ Sci^M$
JT  - Biomedical and environmental sciences : BES^M$
JID - 8909524^M$
SB  - IM^M$
MH  - Adolescent^M$
MH  - Adult^M$
MH  - Condoms/*utilization^M$
MH  - Female^M$
MH  - Humans^M$
MH  - Male^M$
MH  - *Power (Psychology)^M$
MH  - *Sexual Behavior^M$
EDAT- 2007/10/24 09:00^M$
MHDA- 2007/11/14 09:00^M$
PST - ppublish^M$
SO  - Biomed Environ Sci. 2007 Aug;20(4):302-6.^M$
^M$
PMID- 17939181^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA  - 20071015^M$
DCOM- 20071030^M$
PUBM- Print^M$
IS  - 1473-3099 (Print)^M$
VI  - 7^M$
IP  - 10^M$
DP  - 2007 Oct^M$
TI  - Female-initiated prevention strategies key to tackling HIV.^M$
PG  - 637^M$
FAU - Nelson, Roxanne^M$
AU  - Nelson R^M$
LA  - eng^M$
PT  - News^M$
PL  - United States^M$
TA  - Lancet Infect Dis^M$
JT  - The Lancet infectious diseases^M$
JID - 101130150^M$
RN  - 0 (Anti-HIV Agents)^M$
RN  - 0 (Phosphonic Acids)^M$
RN  - 0 (Reverse Transcriptase Inhibitors)^M$
RN  - 107021-12-5 (tenofovir)^M$
RN  - 73-24-5 (Adenine)^M$
SB  - IM^M$
MH  - Adenine/analogs & derivatives/therapeutic use^M$
MH  - Africa South of the Sahara^M$
MH  - Anti-HIV Agents/therapeutic use^M$
MH  - Condoms, Female/utilization^M$
MH  - Diaphragm^M$
MH  - Female^M$
MH  - HIV Infections/*prevention & control^M$
MH  - Humans^M$
MH  - Phosphonic Acids/therapeutic use^M$
MH  - Primary Prevention/*methods^M$
MH  - Reverse Transcriptase Inhibitors/therapeutic use^M$
MH  - Rwanda^M$
MH  - Women/*education/*psychology^M$
EDAT- 2007/10/17 09:00^M$
MHDA- 2007/10/31 09:00^M$
PST - ppublish^M$
SO  - Lancet Infect Dis. 2007 Oct;7(10):637.^M$
^M$
PMID- 17933635^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA  - 20071015^M$
DCOM- 20071026^M$
PUBM- Print^M$
IS  - 1474-547X (Electronic)^M$
VI  - 370^M$
IP  - 9595^M$
DP  - 2007 Oct 13^M$
TI  - Contraception, safe abortion, and maternal morbidity.^M$
PG  - 1294-5^M$
AD  - Department of Population, Family and Reproductive Health, Johns Hopkins Bloomberg^M$
      School of Public Health, Baltimore, MD 21205, USA. [EMAIL PROTECTED]
FAU - Hindin, Michelle J^M$
AU  - Hindin MJ^M$
LA  - eng^M$
PT  - Comment^M$
PT  - Journal Article^M$
PL  - England^M$
TA  - Lancet^M$
JT  - Lancet^M$
JID - 2985213R^M$
SB  - AIM^M$
SB  - IM^M$
CON - Lancet. 2007 Oct 13;370(9595):1329-37. PMID: 17933647^M$
MH  - Abortion, Induced/*statistics & numerical data/trends^M$
MH  - Burkina Faso/epidemiology^M$
MH  - Contraception/economics/*utilization^M$
MH  - Female^M$
MH  - Humans^M$
MH  - *Maternal Mortality^M$
MH  - Obstetric Labor Complications/*epidemiology/mortality/prevention & control^M$
MH  - Pregnancy^M$
EDAT- 2007/10/16 09:00^M$
MHDA- 2007/10/30 09:00^M$
AID - S0140-6736(07)61555-4 [pii]^M$
AID - 10.1016/S0140-6736(07)61555-4 [doi]^M$
PST - ppublish^M$
SO  - Lancet. 2007 Oct 13;370(9595):1294-5.^M$
^M$
PMID- 17933295^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA  - 20071015^M$
DCOM- 20071120^M$
PUBM- Print^M$
IS  - 0039-3665 (Print)^M$
VI  - 38^M$
IP  - 3^M$
DP  - 2007 Sep^M$
TI  - Senegal 2005: results from the demographic and health survey.^M$
PG  - 212-7^M$
LA  - eng^M$
PT  - Journal Article^M$
PL  - United States^M$
TA  - Stud Fam Plann^M$
JT  - Studies in family planning^M$
JID - 7810364^M$
SB  - IM^M$
MH  - Adolescent^M$
MH  - Adult^M$
MH  - Birth Rate/trends^M$
MH  - Breast Feeding/statistics & numerical data^M$
MH  - Contraception/statistics & numerical data/utilization^M$
MH  - *Demography^M$
MH  - Female^M$
MH  - Health Knowledge, Attitudes, Practice^M$
MH  - Health Status^M$
MH  - *Health Surveys^M$
MH  - Humans^M$
MH  - Infant^M$
MH  - Infant Mortality/trends^M$
MH  - Infant, Newborn^M$
MH  - Middle Aged^M$
MH  - Senegal/epidemiology^M$
EDAT- 2007/10/16 09:00^M$
MHDA- 2007/12/06 09:00^M$
PST - ppublish^M$
SO  - Stud Fam Plann. 2007 Sep;38(3):212-7.^M$
^M$
PMID- 17933294^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA  - 20071015^M$
DCOM- 20071120^M$
PUBM- Print^M$
IS  - 0039-3665 (Print)^M$
VI  - 38^M$
IP  - 3^M$
DP  - 2007 Sep^M$
TI  - Guinea 2005: results from the demographic and health survey.^M$
PG  - 206-11^M$
LA  - eng^M$
PT  - Journal Article^M$
PL  - United States^M$
TA  - Stud Fam Plann^M$
JT  - Studies in family planning^M$
JID - 7810364^M$
SB  - IM^M$
MH  - Adolescent^M$
MH  - Adult^M$
MH  - Birth Rate/trends^M$
MH  - Breast Feeding/statistics & numerical data^M$
MH  - Contraception/statistics & numerical data/utilization^M$
MH  - *Demography^M$
MH  - Female^M$
MH  - Guinea/epidemiology^M$
MH  - Health Knowledge, Attitudes, Practice^M$
MH  - Health Status^M$
MH  - *Health Surveys^M$
MH  - Humans^M$
MH  - Infant^M$
MH  - Infant Mortality/trends^M$
MH  - Infant, Newborn^M$
MH  - Middle Aged^M$
EDAT- 2007/10/16 09:00^M$
MHDA- 2007/12/06 09:00^M$
PST - ppublish^M$
SO  - Stud Fam Plann. 2007 Sep;38(3):206-11.^M$
^M$
[EMAIL PROTECTED]:~/medline2popline$
====================================
