I'm trying to process a DOS text file (with DOS CRLF line terminations) and translate from one database export format into another database input format. I've pasted in my program and a short example file of data at the end of this message. If the program is named 'medline2popline.pl' and the data file is named 't.txt', the program is run with './medline2popline.pl t.txt'.
In the data file, fields start with 2-4 uppercase letters or spaces, followed by a dash and a space, followed by the data. They end with CRLF. Records are separated by two consecutive CRLF combinations.

I think my problem is caused by the DOS line terminations and the way I'm trying to handle them in my overall program. My problem is lines that look like this:

AD - Department of Family and Community Medicine, College of Medicine, King Faisal^M$
University, Dammam, Saudi Arabia. [EMAIL PROTECTED]

(This should be just two lines; my email program is wrapping them.)

I'm trying to capture everything from the first 'AD - ' to the next set of four characters that are either upper-case letters or blanks, followed by a dash and a blank. I tried to use this:

my($ad) = /AD - (.*?)\015\012([A-Z]|\s){4}-\s/;

This regex only matches the one address in my sample data that consists of just one line. It fails to match anything for the multi-line addresses. Any suggestions on how I could capture both lines (or however many lines are used) in $ad?

Thanks for your advice and help.

-Kevin

Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland 21202
410-659-6139

============================
My complete program so far:

#! /usr/bin/perl -w
#
#medline2popline.pl written 7-Jan-2008 by Kevin Zembower
#
# This program converts medline records into POPLINE InMagic importable records.
#
# It reads the medline records from STDIN and outputs POPLINE import records to STDOUT
use strict;
$/="\015\012\015\012"; # Read a whole record (separated by a blank line) at a time.
$\="\015\012"; # Output line termination is CRLF (for DOS)
while (<>) {
    chomp;
    #print "Record: $_";
    my($ad) = /AD - (.*?)\015\012([A-Z]|\s){4}-\s/;
    my($au) = /AU - (.*?)\015\012/;
    my($dp) = /DP - (.*?)\015\012/;
    my($ip) = /IP - (.*?)\015\012/;
    my($pg) = /PG - (.*?)\015\012/;
    my($pl) = /PL - (.*?)\015\012/;
    my($tt) = /TT - (.*?)\015\012/;
    my($vi) = /VI - (.*?)\015\012/;
    print "AuthorAddress $ad" if ($ad);
    print "Author $au" if ($au);
    print "DateofPub $dp" if ($dp);
    print "Issue $ip" if ($ip);
    print "Pagination $pg" if ($pg);
    print "JournalCountry $pl" if ($pl);
    print "TT $tt" if ($tt);
    print "Volume $vi" if ($vi);
    print ""; #Blank line between records
}# while there are more lines to input

==============================================
short data file (in cat -vet output format):

[EMAIL PROTECTED]:~/medline2popline$ cat -vet t.txt
PMID- 17991957^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA - 20071109^M$
DCOM- 20071126^M$
PUBM- Print^M$
IS - 1468-5833 (Electronic)^M$
VI - 335^M$
IP - 7627^M$
DP - 2007 Nov 10^M$
TI - Faulty government condoms threaten South Africa's AIDS programme.^M$
PG - 957^M$
FAU - Moszynski, Peter^M$
AU - Moszynski P^M$
LA - eng^M$
PT - News^M$
PL - England^M$
TA - BMJ^M$
JT - BMJ (Clinical research ed.)^M$
JID - 8900488^M$
SB - AIM^M$
SB - IM^M$
MH - Acquired Immunodeficiency Syndrome/*prevention & control^M$
MH - Condoms/*standards^M$
MH - Humans^M$
MH - South Africa^M$
EDAT- 2007/11/10 09:00^M$
MHDA- 2007/12/06 09:00^M$
12/7/2007 1:36PM335/7627/957 [pii]^M$
AID - 10.1136/bmj.39388.651308.DB [doi]^M$
PST - ppublish^M$
SO - BMJ. 2007 Nov 10;335(7627):957.^M$
^M$
PMID- 17983999^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA - 20071106^M$
DCOM- 20071129^M$
PUBM- Print^M$
IS - 1542-2011 (Electronic)^M$
VI - 52^M$
IP - 6^M$
DP - 2007 Nov-Dec^M$
TI - Contraception and lactation.^M$
PG - 614-20^M$
AB - The benefits of breastfeeding for both the infant and the mother are undisputed. ^M$
Longer intervals between births decrease fetal/infant and maternal complications.^M$
Lactation is an effective contraceptive for the first 6 months postpartum only if^M$
women breastfeed exclusively and at regular intervals, including nighttime.^M$
Because a high percentage of women in the United States supplement breastfeeding,^M$
it is important for these women to choose a method of contraception to prevent^M$
unintended pregnancies. Both the method of contraception and the timing of the^M$
initiation of contraceptives are important decisions that a clinician must help^M$
the breastfeeding woman make. Ideally, the chosen method of contraception should ^M$
not interfere with lactation. This article reviews the research on the effect of ^M$
contraceptives, including hormonal contraceptives, on lactation.^M$
AD - Emory University School of Nursing, Atlanta, GA 30322, USA. [EMAIL PROTECTED]
FAU - King, Joyce^M$
AU - King J^M$
LA - eng^M$
LA - fre^M$
PT - Journal Article^M$
PT - Review^M$
PL - United States^M$
TA - J Midwifery Womens Health^M$
JT - Journal of midwifery & women's health^M$
JID - 100909407^M$
RN - 0 (Contraceptive Agents, Female)^M$
SB - IM^M$
SB - N^M$
MH - Contraception/*methods^M$
MH - Contraception Behavior^M$
MH - *Contraceptive Agents, Female^M$
MH - *Contraceptive Devices, Female^M$
MH - Evidence-Based Medicine^M$
MH - Female^M$
MH - Health Knowledge, Attitudes, Practice^M$
MH - Humans^M$
MH - Infant Welfare^M$
MH - Infant, Newborn^M$
MH - *Lactation^M$
MH - Maternal Welfare^M$
MH - Mothers/education^M$
MH - *Patient Education as Topic^M$
MH - *Postpartum Period^M$
RF - 37^M$
EDAT- 2007/11/07 09:00^M$
MHDA- 2007/12/06 09:00^M$
AID - S1526-9523(07)00355-8 [pii]^M$
AID - 10.1016/j.jmwh.2007.08.012 [doi]^M$
PST - ppublish^M$
SO - J Midwifery Womens Health. 2007 Nov-Dec;52(6):614-20.^M$
^M$
PMID- 17955772^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA - 20071024^M$
DCOM- 20071120^M$
PUBM- Print^M$
IS - 1020-3397 (Print)^M$
VI - 13^M$
IP - 4^M$
DP - 2007 Jul-Aug^M$
TI - Birth interval: perceptions and practices among urban-based Saudi Arabian women.^M$
PG - 881-92^M$
AB - To determine perceptions towards birth spacing, actual birth interval and^M$
associated sociodemographic factors, we carried out a cross-sectional study on^M$
436 mothers aged 15-50 years in Al-Khobar. All had had > or = 2 children within^M$
the previous 10 years. Only 5.2% preferred a birth interval of < 2 years, 28.2%^M$
preferred a 2 -< 3-year interval, while the rest favoured > or = 3 years.^M$
Education and employment status were predictors of birth spacing preference.^M$
About half were not aware of the physical benefits associated with longer birth^M$
interval. Only 26.3% had mean birth interval < 2 years. Age and employment status^M$
were significant positive predictors of longer birth interval. Oral contraception^M$
was the most popular method adopted for child spacing.^M$
AD - Department of Family and Community Medicine, College of Medicine, King Faisal^M$
University, Dammam, Saudi Arabia. [EMAIL PROTECTED]
FAU - Rasheed, P^M$
AU - Rasheed P^M$
FAU - Al-Dabal, B K^M$
AU - Al-Dabal BK^M$
LA - eng^M$
PT - Journal Article^M$
PL - Egypt^M$
TA - East Mediterr Health J^M$
JT - Eastern Mediterranean health journal = La revue de sante de la Mediterranee^M$
orientale = al-Majallah al-sihhiyah li-sharq al-mutawassit^M$
JID - 9608387^M$
SB - IM^M$
MH - Adolescent^M$
MH - Adult^M$
MH - *Attitude to Health^M$
MH - *Birth Intervals/psychology/statistics & numerical data^M$
MH - Choice Behavior^M$
MH - Contraception/methods/psychology/statistics & numerical data^M$
MH - Contraception Behavior/psychology/statistics & numerical data^M$
MH - Cross-Sectional Studies^M$
MH - Educational Status^M$
MH - Family Characteristics^M$
MH - Female^M$
MH - *Health Knowledge, Attitudes, Practice^M$
MH - Humans^M$
MH - Intention^M$
MH - Linear Models^M$
MH - Maternal Age^M$
MH - *Mothers/education/psychology/statistics & numerical data^M$
MH - Multivariate Analysis^M$
MH - Occupations/statistics & numerical data^M$
MH - Questionnaires^M$
MH - Saudi Arabia^M$
MH - Socioeconomic Factors^M$
MH - Urban Population^M$
EDAT- 2007/10/25 09:00^M$
MHDA- 2007/12/06 09:00^M$
PST - ppublish^M$
SO - East Mediterr Health J. 2007 Jul-Aug;13(4):881-92.^M$
^M$
PMID- 17955618^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA - 20071005^M$
DCOM- 20071113^M$
PUBM- Print^M$
IS - 1181-7186 (Print)^M$
VI - 19^M$
IP - 5^M$
DP - 2007 Jul-Aug^M$
TI - Infections. Sex and hepatitis C infection in Germany.^M$
PG - 9-10^M$
LA - eng^M$
PT - Newspaper Article^M$
PL - Canada^M$
TA - TreatmentUpdate^M$
JT - TreatmentUpdate^M$
JID - 100891076^M$
SB - X^M$
MH - Adult^M$
MH - Condoms^M$
MH - Germany/epidemiology^M$
MH - Hepatitis C/*epidemiology/prevention & control/transmission^M$
MH - Humans^M$
MH - Male^M$
MH - Risk Factors^M$
MH - *Sexual Behavior^M$
MH - Substance-Related Disorders^M$
MH - Surgical Procedures, Operative/adverse effects^M$
EDAT- 2007/10/25 09:00^M$
MHDA- 2007/11/14 09:00^M$
PST - ppublish^M$
SO - TreatmentUpdate. 2007 Jul-Aug;19(5):9-10.^M$
^M$
PMID- 17952213^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA - 20071022^M$
DCOM- 20071106^M$
LR - 20071115^M$
PUBM- Print^M$
IS - 0256-9574 (Print)^M$
VI - 97^M$
IP - 8^M$
DP - 2007 Aug^M$
TI - Emergency contraception--lack of awareness among women presenting for termination^M$
of pregnancy.^M$
PG - 584-5^M$
FAU - Moodley, Jennifer^M$
AU - Moodley J^M$
FAU - Morroni, Chelsea^M$
AU - Morroni C^M$
LA - eng^M$
PT - Letter^M$
PL - South Africa^M$
TA - S Afr Med J^M$
JT - South African medical journal = Suid-Afrikaanse tydskrif vir geneeskunde^M$
JID - 0404520^M$
RN - 0 (Contraceptives, Postcoital)^M$
SB - IM^M$
MH - *Abortion, Legal^M$
MH - Adolescent^M$
MH - Adult^M$
MH - *Awareness^M$
MH - Contraception/*methods^M$
MH - Contraceptives, Postcoital/*therapeutic use^M$
MH - *Emergencies^M$
MH - Female^M$
MH - *Health Knowledge, Attitudes, Practice^M$
MH - Health Surveys^M$
MH - Humans^M$
MH - *Patient Education as Topic^M$
MH - Pregnancy^M$
EDAT- 2007/10/24 09:00^M$
MHDA- 2007/11/07 09:00^M$
PST - ppublish^M$
SO - S Afr Med J. 2007 Aug;97(8):584-5.^M$
^M$
PMID- 17949389^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA - 20071022^M$
DCOM- 20071108^M$
PUBM- Print^M$
IS - 1471-0528 (Electronic)^M$
VI - 114^M$
IP - 11^M$
DP - 2007 Nov^M$
TI - The feasibility, success and patient satisfaction associated with outpatients^M$
hysteroscopic sterilisation.^M$
PG - 1449; author reply 1449-50^M$
FAU - Qureshi, N S^M$
AU - Qureshi NS^M$
LA - eng^M$
PT - Comment^M$
PT - Letter^M$
PL - England^M$
TA - BJOG^M$
JT - BJOG : an international journal of obstetrics and gynaecology^M$
JID - 100935741^M$
SB - AIM^M$
SB - IM^M$
CON - BJOG. 2007 Jun;114(6):676-83. PMID: 17516957^M$
MH - Ambulatory Surgical Procedures/*methods^M$
MH - Catheter Ablation/methods^M$
MH - Feasibility Studies^M$
MH - Female^M$
MH - Humans^M$
MH - Magnetic Resonance Imaging^M$
MH - Microwaves/therapeutic use^M$
MH - *Patient Satisfaction^M$
MH - Sterilization, Reproductive^M$
EDAT- 2007/10/24 09:00^M$
MHDA- 2007/11/09 09:00^M$
AID - BJO1494 [pii]^M$
AID - 10.1111/j.1471-0528.2007.01494.x [doi]^M$
PST - ppublish^M$
SO - BJOG. 2007 Nov;114(11):1449; author reply 1449-50.^M$
^M$
PMID- 17948764^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA - 20071022^M$
DCOM- 20071113^M$
PUBM- Print^M$
IS - 0895-3988 (Print)^M$
VI - 20^M$
IP - 4^M$
DP - 2007 Aug^M$
TI - Power relation and condom use in commercial sex behaviors.^M$
PG - 302-6^M$
AB - OBJECTIVE: To explore whether condom use is influenced by power relation in^M$
commercial sex behaviors. METHODS: Variables were designed to measure the power^M$
relation in commercial sex behaviors based on the theory of gender and power^M$
relation and data were collected from male sexually transmitted diseases (STD)^M$
patients and female commercial sex workers (FSWs) working at recreation centers^M$
or being detained in a women education center to identify the relationship^M$
between condom use and power relation in male and female respondents using^M$
bivariate and multiple regression analysis. RESULTS: A significant relationship^M$
was identified between power relation and female condom use, the higher the score^M$
of power relations, the higher frequency the condom use, but no similar result^M$
was found in males. Females got a higher score of power relation than males.^M$
CONCLUSIONS: Power relation is one of the factors that influence condom use,^M$
which should be considered when relevant theories are used to study the rate of^M$
condom use. It is worthwhile exploring the relationship between safe sex and^M$
power relation in spouses and regular sex partners when interventions are adopted^M$
to prevent HIV/AIDS spreading from high risk groups to the general population.^M$
AD - Institute of Viral Disease Prevention and Control, Chinese Center for Disease^M$
Control and Prevention, Beijing 100050, China.^M$
FAU - Wang, Ying^M$
AU - Wang Y^M$
FAU - Li, Bing^M$
AU - Li B^M$
FAU - Song, Dong-Mei^M$
AU - Song DM^M$
FAU - Ding, Guang-Yan^M$
AU - Ding GY^M$
FAU - Cathy, Emric^M$
AU - Cathy E^M$
LA - eng^M$
PT - Journal Article^M$
PL - United States^M$
TA - Biomed Environ Sci^M$
JT - Biomedical and environmental sciences : BES^M$
JID - 8909524^M$
SB - IM^M$
MH - Adolescent^M$
MH - Adult^M$
MH - Condoms/*utilization^M$
MH - Female^M$
MH - Humans^M$
MH - Male^M$
MH - *Power (Psychology)^M$
MH - *Sexual Behavior^M$
EDAT- 2007/10/24 09:00^M$
MHDA- 2007/11/14 09:00^M$
PST - ppublish^M$
SO - Biomed Environ Sci. 2007 Aug;20(4):302-6.^M$
^M$
PMID- 17939181^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA - 20071015^M$
DCOM- 20071030^M$
PUBM- Print^M$
IS - 1473-3099 (Print)^M$
VI - 7^M$
IP - 10^M$
DP - 2007 Oct^M$
TI - Female-initiated prevention strategies key to tackling HIV.^M$
PG - 637^M$
FAU - Nelson, Roxanne^M$
AU - Nelson R^M$
LA - eng^M$
PT - News^M$
PL - United States^M$
TA - Lancet Infect Dis^M$
JT - The Lancet infectious diseases^M$
JID - 101130150^M$
RN - 0 (Anti-HIV Agents)^M$
RN - 0 (Phosphonic Acids)^M$
RN - 0 (Reverse Transcriptase Inhibitors)^M$
RN - 107021-12-5 (tenofovir)^M$
RN - 73-24-5 (Adenine)^M$
SB - IM^M$
MH - Adenine/analogs & derivatives/therapeutic use^M$
MH - Africa South of the Sahara^M$
MH - Anti-HIV Agents/therapeutic use^M$
MH - Condoms, Female/utilization^M$
MH - Diaphragm^M$
MH - Female^M$
MH - HIV Infections/*prevention & control^M$
MH - Humans^M$
MH - Phosphonic Acids/therapeutic use^M$
MH - Primary Prevention/*methods^M$
MH - Reverse Transcriptase Inhibitors/therapeutic use^M$
MH - Rwanda^M$
MH - Women/*education/*psychology^M$
EDAT- 2007/10/17 09:00^M$
MHDA- 2007/10/31 09:00^M$
PST - ppublish^M$
SO - Lancet Infect Dis. 2007 Oct;7(10):637.^M$
^M$
PMID- 17933635^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA - 20071015^M$
DCOM- 20071026^M$
PUBM- Print^M$
IS - 1474-547X (Electronic)^M$
VI - 370^M$
IP - 9595^M$
DP - 2007 Oct 13^M$
TI - Contraception, safe abortion, and maternal morbidity.^M$
PG - 1294-5^M$
AD - Department of Population, Family and Reproductive Health, Johns Hopkins Bloomberg^M$
School of Public Health, Baltimore, MD 21205, USA. [EMAIL PROTECTED]
FAU - Hindin, Michelle J^M$
AU - Hindin MJ^M$
LA - eng^M$
PT - Comment^M$
PT - Journal Article^M$
PL - England^M$
TA - Lancet^M$
JT - Lancet^M$
JID - 2985213R^M$
SB - AIM^M$
SB - IM^M$
CON - Lancet. 2007 Oct 13;370(9595):1329-37. PMID: 17933647^M$
MH - Abortion, Induced/*statistics & numerical data/trends^M$
MH - Burkina Faso/epidemiology^M$
MH - Contraception/economics/*utilization^M$
MH - Female^M$
MH - Humans^M$
MH - *Maternal Mortality^M$
MH - Obstetric Labor Complications/*epidemiology/mortality/prevention & control^M$
MH - Pregnancy^M$
EDAT- 2007/10/16 09:00^M$
MHDA- 2007/10/30 09:00^M$
AID - S0140-6736(07)61555-4 [pii]^M$
AID - 10.1016/S0140-6736(07)61555-4 [doi]^M$
PST - ppublish^M$
SO - Lancet. 2007 Oct 13;370(9595):1294-5.^M$
^M$
PMID- 17933295^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA - 20071015^M$
DCOM- 20071120^M$
PUBM- Print^M$
IS - 0039-3665 (Print)^M$
VI - 38^M$
IP - 3^M$
DP - 2007 Sep^M$
TI - Senegal 2005: results from the demographic and health survey.^M$
PG - 212-7^M$
LA - eng^M$
PT - Journal Article^M$
PL - United States^M$
TA - Stud Fam Plann^M$
JT - Studies in family planning^M$
JID - 7810364^M$
SB - IM^M$
MH - Adolescent^M$
MH - Adult^M$
MH - Birth Rate/trends^M$
MH - Breast Feeding/statistics & numerical data^M$
MH - Contraception/statistics & numerical data/utilization^M$
MH - *Demography^M$
MH - Female^M$
MH - Health Knowledge, Attitudes, Practice^M$
MH - Health Status^M$
MH - *Health Surveys^M$
MH - Humans^M$
MH - Infant^M$
MH - Infant Mortality/trends^M$
MH - Infant, Newborn^M$
MH - Middle Aged^M$
MH - Senegal/epidemiology^M$
EDAT- 2007/10/16 09:00^M$
MHDA- 2007/12/06 09:00^M$
PST - ppublish^M$
SO - Stud Fam Plann. 2007 Sep;38(3):212-7.^M$
^M$
PMID- 17933294^M$
OWN - NLM^M$
STAT- MEDLINE^M$
DA - 20071015^M$
DCOM- 20071120^M$
PUBM- Print^M$
IS - 0039-3665 (Print)^M$
VI - 38^M$
IP - 3^M$
DP - 2007 Sep^M$
TI - Guinea 2005: results from the demographic and health survey.^M$
PG - 206-11^M$
LA - eng^M$
PT - Journal Article^M$
PL - United States^M$
TA - Stud Fam Plann^M$
JT - Studies in family planning^M$
JID - 7810364^M$
SB - IM^M$
MH - Adolescent^M$
MH - Adult^M$
MH - Birth Rate/trends^M$
MH - Breast Feeding/statistics & numerical data^M$
MH - Contraception/statistics & numerical data/utilization^M$
MH - *Demography^M$
MH - Female^M$
MH - Guinea/epidemiology^M$
MH - Health Knowledge, Attitudes, Practice^M$
MH - Health Status^M$
MH - *Health Surveys^M$
MH - Humans^M$
MH - Infant^M$
MH - Infant Mortality/trends^M$
MH - Infant, Newborn^M$
MH - Middle Aged^M$
EDAT- 2007/10/16 09:00^M$
MHDA- 2007/12/06 09:00^M$
PST - ppublish^M$
SO - Stud Fam Plann. 2007 Sep;38(3):206-11.^M$
^M$
[EMAIL PROTECTED]:~/medline2popline$
====================================
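One possible direction for the multi-line AD capture, as a minimal sketch rather than a definitive fix: in Perl, '.' does not match a line feed unless the /s modifier is given, which would explain why the lazy '(.*?)' in the pattern above cannot cross the CRLF inside a wrapped address. The helper name extract_ad is hypothetical (it is not part of the program above). The tag pattern assumes every field tag starts with an uppercase letter and fills at most four columns before the dash.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original program): return the text of the
# first AD field in one record, with wrapped lines joined by a single space.
sub extract_ad {
    my ($record) = @_;
    # /m: '^' matches at the start of every line inside the record.
    # /s: '.' also matches LF, so the lazy '.*?' can span several lines.
    # The lookahead stops the capture at a CRLF followed by the next tag
    # (uppercase letter, 1-3 more letters or blanks, dash, whitespace),
    # or at the end of the record (assumption: tags start with a letter).
    my ($ad) = $record
        =~ /^AD\s*-\s(.*?)(?=\015\012[A-Z][A-Z ]{1,3}-\s|\s*\z)/ms;
    return unless defined $ad;
    $ad =~ s/\015\012\s*/ /g;    # unfold continuation lines
    return $ad;
}
```

Inside the while loop, something like 'my $ad = extract_ad($_);' could then stand in for the existing AD match, and the other fields could be handled the same way if they also wrap.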