With the following script I get week number 44 and year 118, which is a strange result. For the date 2018-12-31 the week should be 1 and the year 2019. What is wrong here?

Tom

from datetime import datetime, timedelta, date

flowFile = session.get()
if (flowFile != None):
    file_name = flowFile.getAttribute('filename')
    date_file = file_name.split("_")[6]
    date_final = date_file.split(".")[0]
    date_obj = datetime.strptime(date_final,'%y%m%d')
    date_year = date_obj.year
    date_day = date_obj.day
    date_month = date_obj.month
    week_att = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
    year_att = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]
    str_week = str(week_att)
    str_year = str(year_att)
    flowFile = session.putAttribute(flowFile, "year_extracted", str_year)
    flowFile = session.putAttribute(flowFile, "week_extracted", str_week)
    session.transfer(flowFile, REL_SUCCESS)
    session.commit()

On Tue, 29 Jan 2019 at 16:59, Tomislav Novosel <to.novo...@gmail.com> wrote:

> Thank you all for the answers. The reason I want to do this with a Python
> script is the wrong calculation of the week number from a date. NiFi has that
> function in Expression Language (extracted_date:format("w", <<time_zone>>)).
> My time zone is GMT+2.
> If I set a date, for example 20180819, and the time zone in the function to
> GMT, I get week number 34, which is wrong. If I omit the time zone, I get
> week number 33, which is right. I'm not sure if that's a bug. You can test it
> for yourself, and if you do, please share your findings here; maybe I'm doing
> something wrong.
>
> On the other hand, if I use Python, I'm more sure that I will get the correct
> week number, even for dates that fall into a week of the next year
> (e.g. 20181231).
>
> Since this calculation will be in production, I need a resilient workflow
> that runs without errors in the future.
>
> Regarding the script I sent above, I'm getting the error: "week cannot be
> coerced as string". I checked right at the beginning whether the session is
> null or not.
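
A possible cause, offered as a sketch rather than a confirmed diagnosis: the example dates in this thread (20180819, 20181231) carry a four-digit year, while the format string '%y%m%d' asks for a two-digit year. Assuming the seventh underscore-separated field of the filename really is YYYYMMDD, parsing it with '%Y%m%d' and calling isocalendar() on the result gives the ISO year and week directly. The helper name iso_year_and_week and the sample filename are hypothetical, and the snippet is plain Python with no NiFi objects involved.

```python
from datetime import datetime


def iso_year_and_week(file_name):
    """Return (ISO year, ISO week) as strings.

    Assumption: the field at index 6 (split on "_") holds the date as
    YYYYMMDD, optionally followed by a file extension.
    """
    date_field = file_name.split("_")[6]   # e.g. "20181231.csv"
    date_text = date_field.split(".")[0]   # e.g. "20181231"
    # %Y is the four-digit year; %y would expect only two digits.
    parsed = datetime.strptime(date_text, "%Y%m%d")
    iso = parsed.isocalendar()             # (ISO year, ISO week, ISO weekday)
    # FlowFile attribute values are strings, so convert before returning.
    return str(iso[0]), str(iso[1])


if __name__ == "__main__":
    # Hypothetical filename; 2018-12-31 is a Monday in ISO week 1 of 2019.
    print(iso_year_and_week("a_b_c_d_e_f_20181231.csv"))  # ('2019', '1')
```

With the thread's other example date, 20180819, the same helper returns ('2018', '33'), which matches the week number Tom expects.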
>
> On Tue, 29 Jan 2019, 16:26 Jerry Vinokurov <grapesmo...@gmail.com> wrote:
>
>> I wanted to add, since I've done this specific operation many times, that
>> you can really just do this via the NiFi expression language, which I think
>> is more "idiomatic" than having ExecuteScript processors all over the
>> place. Basically, you would have an UpdateAttribute that set something
>> called, say, date_extracted with an expression that looks something like
>> ${filename:substringAfterLast('_'):toDate('yyyy.MM.dd')} (this is an
>> approximation based on the above, modify as necessary for your purpose).
>> Then you could use a second UpdateAttribute to extract various information
>> from this date with the format command, e.g.
>> ${date_extracted:format('<your format expression here>')}. I don't think
>> there's one for "week" but in general this is the approach I take when I
>> need to do date munging.
>>
>> On Tue, Jan 29, 2019 at 10:06 AM Tomislav Novosel <to.novo...@gmail.com>
>> wrote:
>>
>>> Hi Matt, thanks for the suggestions. But performance is not crucial here.
>>> This is the code I tried, but I get the error: "AttributeError: 'NoneType'
>>> object has no attribute 'getAttribute' at line number 4"
>>> If I remove the code from line 6 to line 14, it works with some default
>>> attribute values for year_extracted and week_extracted; otherwise I get
>>> the error from above.
>>>
>>> Tom
>>>
>>> from datetime import datetime, timedelta, date
>>>
>>> flowFile = session.get()
>>> file_name = flowFile.getAttribute('filename')
>>>
>>> date_file = file_name.split("_")[6]
>>> date_final = date_file.split(".")[0]
>>> date_obj = datetime.strptime(date_final,'%y%m%d')
>>> date_year = date_obj.year
>>> date_day = date_obj.day
>>> date_month = date_obj.month
>>>
>>> week = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
>>> year = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]
>>>
>>> if (flowFile != None):
>>>     flowFile = session.putAttribute(flowFile, "year_extracted", year)
>>>     flowFile = session.putAttribute(flowFile, "week_extracted", week)
>>>     session.transfer(flowFile, REL_SUCCESS)
>>>     session.commit()
>>>
>>> On Tue, 29 Jan 2019 at 15:53, Matt Burgess <mattyb...@apache.org> wrote:
>>>
>>>> Tom,
>>>>
>>>> Keep in mind that you are using Jython not Python, which I mention
>>>> only to point out that it is *much* slower than the native Java
>>>> processors such as UpdateAttribute, and slower than other scripting
>>>> engines such as Groovy or Javascript/Nashorn.
>>>>
>>>> If performance/throughput is not a concern and you're more comfortable
>>>> with Jython, then Jerry's suggestion of session.putAttribute(flowFile,
>>>> attributeName, attributeValue) should do the trick. Note that if you
>>>> are adding more than a couple attributes, it's probably better to
>>>> create a dictionary (eventually/actually, a Java Map<String,String>)
>>>> of attribute name/value pairs, and use putAllAttributes(flowFile,
>>>> attributes) instead, as it is more performant.
>>>>
>>>> Regards,
>>>> Matt
>>>>
>>>> On Tue, Jan 29, 2019 at 9:25 AM Tomislav Novosel <to.novo...@gmail.com>
>>>> wrote:
>>>> >
>>>> > Thanks for the answer.
>>>> >
>>>> > Yes, I know I can handle that with Expression Language and the
>>>> > UpdateAttribute processor, but this is a specific case at my work and
>>>> > I think Python is the better and simpler solution. I need to
>>>> > calculate that with a Python script.
>>>> >
>>>> > Tom
>>>> >
>>>> > On Tue, 29 Jan 2019 at 15:18, John McGinn <amruginn-n...@yahoo.com>
>>>> > wrote:
>>>> >>
>>>> >> Since your script shows that "filename" is an attribute of your
>>>> >> flowfile, you could use the UpdateAttribute processor.
>>>> >>
>>>> >> If you right click on UpdateAttribute and choose ShowUsage, then
>>>> >> choose Expression Language Guide, it shows you the things you can
>>>> >> handle.
>>>> >>
>>>> >> Something along the lines of ${filename:getDelimitedField(6,'_')},
>>>> >> if I understand the Groovy code correctly. I did a GenerateFlowFile
>>>> >> to an UpdateAttribute processor setting filename to
>>>> >> "1_2_3_4_5_6.2_abc", then sent that to another UpdateAttribute with
>>>> >> the getDelimitedField() I listed and I received 6.2. Then another
>>>> >> UpdateAttribute could parse the 6.2 for the second substring, or you
>>>> >> might be able to chain them in the existing UpdateProcessor.
>>>> >>
>>>> >>
>>>> >> --------------------------------------------
>>>> >> On Tue, 1/29/19, Tomislav Novosel <to.novo...@gmail.com> wrote:
>>>> >>
>>>> >>  Subject: Modify Flowfile attributes
>>>> >>  To: users@nifi.apache.org
>>>> >>  Date: Tuesday, January 29, 2019, 9:04 AM
>>>> >>
>>>> >>  Hi all,
>>>> >>  I'm trying to calculate week number and date from filename using
>>>> >>  ExecuteScript processor and Jython. Here is the Python script.
>>>> >>  How can I add the calculated attributes week and year to the
>>>> >>  flowfile?
>>>> >>  Please help, thank you.
>>>> >>  Tom
>>>> >>  P.S. Maybe I completely missed with this script.
>>>> >>  Feel free to correct me.
>>>> >>
>>>> >>  import json
>>>> >>  import java.io
>>>> >>  from org.apache.commons.io import IOUtils
>>>> >>  from java.nio.charset import StandardCharsets
>>>> >>  from org.apache.nifi.processor.io import StreamCallback
>>>> >>  from datetime import datetime, timedelta, date
>>>> >>
>>>> >>  class PyStreamCallback(StreamCallback):
>>>> >>      def __init__(self, flowfile):
>>>> >>          self.ff = flowfile
>>>> >>          pass
>>>> >>      def process(self, inputStream, outputStream):
>>>> >>          file_name = self.ff.getAttribute("filename")
>>>> >>          date_file = file_name.split("_")[6]
>>>> >>          date_final = date_file.split(".")[0]
>>>> >>          date_obj = datetime.strptime(date_final,'%y%m%d')
>>>> >>          date_year = date_obj.year
>>>> >>          date_day = date_obj.day
>>>> >>          date_month = date_obj.month
>>>> >>          week = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
>>>> >>          year = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]
>>>> >>
>>>> >>  flowFile = session.get()
>>>> >>  if (flowFile != None):
>>>> >>      session.transfer(flowFile, REL_SUCCESS)
>>>> >>      session.commit()
>>>> >>
>>
>> --
>> http://www.google.com/profiles/grapesmoker
>
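
The two errors quoted earlier in the thread ("'NoneType' object has no attribute 'getAttribute'" and the string coercion error) both look avoidable by checking the FlowFile before touching it and by converting values to strings before setting them. Below is a rough sketch of that flow, using the putAllAttributes call Matt mentions. The names build_attributes, handle_flow_file, FakeFlowFile and FakeSession are hypothetical; the fake classes only stand in for NiFi's objects so the snippet runs outside ExecuteScript, where the real session and REL_SUCCESS bindings would be passed instead. It also assumes the date field is YYYYMMDD.

```python
from datetime import datetime


def build_attributes(file_name):
    """Build a name -> value dict; every value is already a string."""
    # Assumption: field index 6 holds the date as YYYYMMDD plus an extension.
    date_text = file_name.split("_")[6].split(".")[0]
    iso = datetime.strptime(date_text, "%Y%m%d").isocalendar()
    return {"year_extracted": str(iso[0]), "week_extracted": str(iso[1])}


def handle_flow_file(session, rel_success):
    """Guard against a missing FlowFile, then set attributes and transfer."""
    flow_file = session.get()
    if flow_file is None:
        # Nothing queued: return before calling getAttribute on None.
        return None
    attributes = build_attributes(flow_file.getAttribute("filename"))
    flow_file = session.putAllAttributes(flow_file, attributes)
    session.transfer(flow_file, rel_success)
    return attributes


class FakeFlowFile(object):
    """Hypothetical stand-in for a NiFi FlowFile (attributes only)."""

    def __init__(self, attributes):
        self.attributes = dict(attributes)

    def getAttribute(self, name):
        return self.attributes.get(name)


class FakeSession(object):
    """Hypothetical stand-in for the session, holding at most one FlowFile."""

    def __init__(self, flow_file):
        self._flow_file = flow_file
        self.transferred = []

    def get(self):
        flow_file, self._flow_file = self._flow_file, None
        return flow_file

    def putAllAttributes(self, flow_file, attributes):
        merged = dict(flow_file.attributes)
        merged.update(attributes)
        return FakeFlowFile(merged)

    def transfer(self, flow_file, relationship):
        self.transferred.append((flow_file, relationship))
```

Keeping the parsing in build_attributes separate from the session handling means the date logic can be checked on its own, without a running NiFi instance.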