Hi Rich,

I can confirm that Perl Maven is a well-written and up-to-date resource for learning Perl, and the article https://perlmaven.com/how-to-read-a-csv-file-using-perl is a good path to understanding the best approach to processing CSV. Definitely read it through to the end: your AuthenticationDetails column holds quoted values with embedded commas, so you will want the Text::CSV module the article works up to, not a plain split on commas.
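To make that concrete, here is a minimal sketch of reading one daily file with Text::CSV. The file name is a placeholder, and the hashref-per-row layout is just one reasonable choice:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Text::CSV;

    my $file = 'daily-export.csv';    # placeholder for one of your daily files

    my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 });
    open my $fh, '<:encoding(UTF-8)', $file or die "Cannot open $file: $!";

    # Consume the header line once and use it to name the columns;
    # each later getline_hr() call returns a hashref keyed by header.
    $csv->column_names(@{ $csv->getline($fh) });

    my @rows;
    while (my $row = $csv->getline_hr($fh)) {
        push @rows, $row;
    }
    close $fh;

    printf "Imported %d rows\n", scalar @rows;

Reading the header with getline() before the loop also answers your question about ignoring the header after the first import: every file's first line is consumed as column names, never stored as data.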
With regard to the kinds of queries you'll be working on, here's an explanation of the best approach to turning the Timestamp column into DateTime objects:

https://chatgpt.com/share/69793681-6538-800c-a726-bb59b1cca6b1

Once each row carries a DateTime object, you can order the rows with DateTime->compare:

https://metacpan.org/pod/DateTime#DateTime-%3Ecompare(-$dt1,-$dt2-),-DateTime-%3Ecompare_ignore_floating(-$dt1,-$dt2-)
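A rough sketch of both steps, assuming the timestamp format in your sample data ("Jan 27, 2026 3:30:56 PM") and rows shaped like the hashrefs from the reading sketch above (the two sample rows here are made up):

    use strict;
    use warnings;
    use DateTime;
    use DateTime::Format::Strptime;

    # Stand-in rows shaped like the hashrefs Text::CSV returns;
    # the addresses are placeholders.
    my @rows = (
        { Timestamp => 'Jan 27, 2026 3:33:04 PM', SenderFromAddress => 'b@example.com' },
        { Timestamp => 'Jan 27, 2026 3:30:56 PM', SenderFromAddress => 'a@example.com' },
    );

    # Pattern for timestamps like "Jan 27, 2026 3:30:56 PM".
    my $strp = DateTime::Format::Strptime->new(
        pattern   => '%b %d, %Y %I:%M:%S %p',
        locale    => 'en_US',
        time_zone => 'local',     # assumption: the exports use your local zone
        on_error  => 'croak',
    );

    # Attach a DateTime object to every row.
    $_->{dt} = $strp->parse_datetime($_->{Timestamp}) for @rows;

    # Order the rows oldest-first with DateTime->compare.
    my @sorted = sort { DateTime->compare($a->{dt}, $b->{dt}) } @rows;

    # Example query: the most recent row for one sender.
    my ($latest) = grep { $_->{SenderFromAddress} eq 'a@example.com' }
                   reverse @sorted;
    print $latest->{dt}->iso8601, "\n" if $latest;

From there, queries like "the last time SenderFromAddress had a given OrgLevelPolicy value" are a grep over the sorted rows, as in the last few lines above.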
I hope you find these resources helpful.

Kind regards,
Andrew

On Tue, Jan 27, 2026 at 9:52 PM Gomes, Rich via beginners <[email protected]> wrote:

> I am working on an ongoing project with email where I will need to import
> daily CSV files into Perl and create a searchable database of all the files.
>
> Here is some example data:
>
> Header:
> (Some of these fields/columns may or may not be removed in future CSVs,
> but this is what we have for now)
>
> Timestamp,SenderFromDomain,SenderFromAddress,DMARC,RecipientEmailAddress,Subject,SenderIPv4,Connectors,DeliveryAction,EmailActionPolicy,OrgLevelAction,OrgLevelPolicy,UserLevelAction,UserLevelPolicy,AuthenticationDetails,Context,ReportId,SenderObjectId
>
> Example rows:
>
> "Jan 27, 2026 3:30:56 PM",domain.com,[email protected],pass,[email protected],Thank you for your application,20.1.130.13,,Delivered,,Allow,Connection policy,,,"{""SPF"":""pass"",""DKIM"":""pass"",""DMARC"":""pass"",""CompAuth"":""pass""}",,4647d63d-1f9d-4982-6c39-08de5de2f778-18193297287602271192-1,1d3478ee-351f-4ee9-b6ec-7b03ee68e334
>
> "Jan 27, 2026 3:33:04 PM", domain.ar,notifica@ domain.ar,pass,[email protected],Envío de Orden de Compra Aramark Nro. 115615,149.72.150.13,,Delivered,,,,,,"{""SPF"":""pass"",""DKIM"":""pass"",""DMARC"":""pass"",""CompAuth"":""pass""}",,976717e0-23ac-4538-a058-08de5de33a88-6451908357547151849-1,
>
> "Jan 27, 2026 3:31:29 PM", domain.com,paradox@ domain.com,pass,[email protected],Please confirm your interview with HR Reps,159.183.2.108,,Delivered,,,,,,"{""SPF"":""pass"",""DKIM"":""pass"",""DMARC"":""pass"",""CompAuth"":""pass""}",,f8d7f41e-fb08-491c-43f4-08de5de30c16-11061410221252786783-1,5767d814-45d6-4a03-bb3b-434692b8edc3
>
> My initial question is:
>
> Since the data will stay for some time (at least a year), is a database
> the best thing to import the data "into"? Or would an array be a better approach?
>
> Some of the queries I expect to perform are:
>
> "Show me the last time that a specific value in SenderFromAddress had a
> Connector value of 'empty'"
>
> "Show me the last time that SenderFromAddress had an OrgLevelPolicy value
> of 'xyz'"
>
> Things like that. Basically, query any combination of fields.
>
> Also, since all the files are in the same format, how do you "ignore" the
> header after the "first import"?
>
> Also, there is a potential for some overlap in data, albeit small (I am
> pulling this data from a KQL query in O365). Is there a "routine" I can run
> against the data to detect and remove any duplicate data?
> I would like to learn how to do this both during the import and also run
> it against existing data. That may seem "extra", but this is all about me
> learning how to do each of these things.
>
> Is this a good starting place for what I am looking to do?
>
> *How to read a CSV file using Perl?*
> <https://perlmaven.com/how-to-read-a-csv-file-using-perl>
>
> Thank you,
>
> Rich