Yes, it works now as expected. It seems that strptime does not work as it should. I also tried strftime, and NiFi gives me an error that the method does not exist. The reason I tried it was to avoid the hardcoded string "20" for the year in the line that works:

week = date(year=int("20" + date_final[0:2]), month=int(date_final[2:4]), day=int(date_final[4:6])).isocalendar()[1]

The strftime attempt was:

date_file = file_name.split("_")[6]
date_final = date_file.split(".")[0]
date_obj = datetime.strptime(date_final, '%y%m%d')
date_string = date_obj.strftime('%Y%m%d')
year_sliced = int(date_string[0:4])
month_sliced = int(date_string[4:6])
day_sliced = int(date_string[6:8])
week = date(year=year_sliced, month=month_sliced, day=day_sliced).isocalendar()[1]
year = date(year=year_sliced, month=month_sliced, day=day_sliced).isocalendar()[0]

So this is not a nice solution, but it works.

Thank you very much, Arpad, for all the answers; this will be OK for now. I also appreciate the answers and help from everyone else.

Regards,
Tom

On Wed, 30 Jan 2019 at 13:38, Arpad Boda <ab...@hortonworks.com> wrote:

> I know it’s a hack, but as your date format is fixed length (6 chars), the
> following should work:
>
> week_att = date(year=int(date_final[0:1]), month=int(date_final[2:3]), day=int(date_final[4:5])).isocalendar()[1]
>
> Yeah, it’s not a solution, but I wonder if skipping strptime fixes the
> problem or not.
>
> From: Tomislav Novosel <to.novo...@gmail.com>
> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
> Date: Wednesday, 30 January 2019 at 13:31
> To: "users@nifi.apache.org" <users@nifi.apache.org>
> Subject: Re: Modify Flowfile attributes
>
> Yes, it is strange.
>
> If I do this:
>
> date_obj = datetime.strptime('20' + date_final, '%Y%m%d')
>
> The result is the same.
>
> It gives me week 44 and year 118, but if I run the code locally it gives
> the correct week and year: week 1 and year 2019.
>
> Tom.
>
> On Wed, 30 Jan 2019 at 13:24, Arpad Boda <ab...@hortonworks.com> wrote:
>
> This sounds very strange.
>
> What happens if you do this:
>
> date_obj = datetime.strptime('20' + date_final, '%Y%m%d')
>
> ?
>
> From: Tomislav Novosel <to.novo...@gmail.com>
> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
> Date: Wednesday, 30 January 2019 at 12:44
> To: "users@nifi.apache.org" <users@nifi.apache.org>
> Subject: Re: Modify Flowfile attributes
>
> Arpad,
>
> I also tested my Python code standalone, locally on my laptop, and the
> results are as expected: 1st week of 2019 for date 181231, which is my case.
>
> I also tried to add the two variables marked red as attributes to my
> flowfile, and the result is as expected: date_file has the value
> 181231.parquet (a parquet file is my case) and date_final has the value
> 181231.
>
> So the red variables are not the problem.
>
> Problem is in variables marked red:
>
> date_file = file_name.split("_")[6]
> date_final = date_file.split(".")[0]
> date_obj = datetime.strptime(date_final, '%y%m%d')
> date_year = date_obj.year
> date_day = date_obj.day
> date_month = date_obj.month
>
> So the Python code runs correctly locally, but on NiFi (Jython) it does not.
>
> Regards,
> Tom
>
> On Wed, 30 Jan 2019 at 12:33, Arpad Boda <ab...@hortonworks.com> wrote:
>
> Tom,
>
> Not sure we are on the same page.
>
> I tested the Python code of yours standalone, not in NiFi.
>
> As the Python code is fine (even with Jython), I think the issue is
> somewhere here:
>
> date_file = file_name.split("_")[6]
> date_final = date_file.split(".")[0]
> date_obj = datetime.strptime(date_final, '%y%m%d')
>
> My testing assumed “date_final” to be “181231”, which I guess doesn’t
> apply in your case.
>
> Could you *modify* your Python code to add the two variables (marked red)
> as attributes to your flow file?
>
> Regards,
> Arpad
>
> From: Tomislav Novosel <to.novo...@gmail.com>
> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
> Date: Wednesday, 30 January 2019 at 12:20
> To: "users@nifi.apache.org" <users@nifi.apache.org>
> Subject: Re: Modify Flowfile attributes
>
> Arpad,
>
> I tried to pass the variables date_year, date_day and date_month to the
> outgoing flowfile and I get unexpected values.
>
> For day I get 1, for year 118 and for month 11.
>
> And that gives week number 44 and year 118 according to my code.
>
> It is strange that my code works as expected on your machine. I use NiFi
> 1.7.1.
>
> Regards,
> Tom
>
> On Wed, 30 Jan 2019 at 11:25, Arpad Boda <ab...@hortonworks.com> wrote:
>
> Tom,
>
> Could you use the LogAttribute processor and somehow log the value of your
> “date_final” variable?
>
> I tested your code with Jython; with input string “181231” it works as
> expected (the result is the 1st week of 2019).
>
> From: Tomislav Novosel <to.novo...@gmail.com>
> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
> Date: Wednesday, 30 January 2019 at 11:10
> To: "users@nifi.apache.org" <users@nifi.apache.org>
> Subject: Re: Modify Flowfile attributes
>
> Yes, the values are correct. The attribute has the value it is expected
> to have,
>
> i.e. for date 181231 in the filename I get value 18231 for attribute
> week_extracted, which is extracted from the filename with the split method.
>
> Tom.
>
> On Wed, 30 Jan 2019 at 10:59, Arpad Boda <ab...@hortonworks.com> wrote:
>
> Hi Tom,
>
> “that is exactly what I tried and date_final or date_file are applied to
> the attribute of outgoing flowfile, it works.”
>
> It works as they are strings, so not working would be a surprise. The
> question is: what are their values? 😊
>
> Regards,
> Arpad
>
> From: Tomislav Novosel <to.novo...@gmail.com>
> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
> Date: Wednesday, 30 January 2019 at 10:53
> To: "users@nifi.apache.org" <users@nifi.apache.org>
> Subject: Re: Modify Flowfile attributes
>
> Hi Arpad,
>
> that is exactly what I tried, and when date_final or date_file is applied
> to the attribute of the outgoing flowfile, it works.
>
> But if I put week_att into the attribute, there is an error: week_att
> cannot be coerced as String, and if I put str_week it gives me week
> number 44.
>
> Tom
>
> On Wed, 30 Jan 2019 at 08:40, Arpad Boda <ab...@hortonworks.com> wrote:
>
> Tom,
>
> The Python code to get the week number for a datetime string seems to be
> correct.
>
> To help debugging, could you stamp your “date_final” or “date_file”
> variable to an attribute, so we could see what the input is?
>
> My gut feeling says there is some parsing magic going wrong here.
>
> Regards,
> Arpad
>
> From: Tomislav Novosel <to.novo...@gmail.com>
> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
> Date: Tuesday, 29 January 2019 at 20:13
> To: "users@nifi.apache.org" <users@nifi.apache.org>
> Subject: Re: Modify Flowfile attributes
>
> With the following script I get week number 44 and year 118, which is a
> strange result.
> Week should be 1 and year 2019 for date 2018-31-12.
>
> What is wrong here?
>
> Tom
>
> from datetime import datetime, timedelta, date
>
> flowFile = session.get()
> if (flowFile != None):
>     file_name = flowFile.getAttribute('filename')
>
>     date_file = file_name.split("_")[6]
>     date_final = date_file.split(".")[0]
>     date_obj = datetime.strptime(date_final, '%y%m%d')
>     date_year = date_obj.year
>     date_day = date_obj.day
>     date_month = date_obj.month
>
>     week_att = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
>     year_att = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]
>     str_week = str(week_att)
>     str_year = str(year_att)
>
>     flowFile = session.putAttribute(flowFile, "year_extracted", str_year)
>     flowFile = session.putAttribute(flowFile, "week_extracted", str_week)
>     session.transfer(flowFile, REL_SUCCESS)
>     session.commit()
>
> On Tue, 29 Jan 2019 at 16:59, Tomislav Novosel <to.novo...@gmail.com>
> wrote:
>
> Thank you all for the answers. The reason I want to do this with a Python
> script is the wrong calculation of the week number from a date. NiFi has
> that function in the expression language
> (extracted_date:format("w", <<time_zone>>)).
> My time zone is GMT+2.
>
> If I set a date, for example 20180819, and time zone GMT in the function,
> I get week number 34, which is wrong. If I omit the time zone, I get week
> number 33, which is right. I'm not sure if that's a bug. You can test it
> for yourself, and if you do, please share your findings here; maybe I'm
> doing something wrong.
>
> On the other hand, if I use Python, I'm more sure that I will get the
> correct week number, even for dates which overlap with a week number in
> the next year (e.g. 20181231).
>
> Since this calculation will be in production, I need a resilient workflow
> in the future without errors.
>
> Regarding the script I sent above, I'm getting the error: "week cannot be
> coerced as string". I checked right at the beginning whether the session
> is null or not.
>
> On Tue, 29 Jan 2019, 16:26 Jerry Vinokurov <grapesmo...@gmail.com wrote:
>
> I wanted to add, since I've done this specific operation many times, that
> you can really just do this via the NiFi expression language, which I think
> is more "idiomatic" than having ExecuteScript processors all over the
> place. Basically, you would have an UpdateAttribute that set something
> called, say, date_extracted with an expression that looks something like
> ${filename:substringAfterLast('_'):toDate('yyyy.MM.dd')} (this is an
> approximation based on the above, modify as necessary for your purpose).
> Then you could use a second UpdateAttribute to extract various information
> from this date with the format command, e.g.
> ${date_extracted:format('<your format expression here>')}. I don't think
> there's one for "week" but in general this is the approach I take when I
> need to do date munging.
>
> On Tue, Jan 29, 2019 at 10:06 AM Tomislav Novosel <to.novo...@gmail.com>
> wrote:
>
> Hi Matt, thanks for the suggestions. But performance is not crucial here.
>
> This is the code I tried, but I get the error: "AttributeError: 'NoneType'
> object has no attribute 'getAttribute' at line number 4"
>
> If I remove the code from line 6 to line 14, it works with some default
> attribute values for year_extracted and week_extracted; otherwise I get
> the error from above.
>
> Tom
>
> from datetime import datetime, timedelta, date
>
> flowFile = session.get()
> file_name = flowFile.getAttribute('filename')
>
> date_file = file_name.split("_")[6]
> date_final = date_file.split(".")[0]
> date_obj = datetime.strptime(date_final, '%y%m%d')
> date_year = date_obj.year
> date_day = date_obj.day
> date_month = date_obj.month
>
> week = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
> year = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]
>
> if (flowFile != None):
>     flowFile = session.putAttribute(flowFile, "year_extracted", year)
>     flowFile = session.putAttribute(flowFile, "week_extracted", week)
>     session.transfer(flowFile, REL_SUCCESS)
>     session.commit()
>
> On Tue, 29 Jan 2019 at 15:53, Matt Burgess <mattyb...@apache.org> wrote:
>
> Tom,
>
> Keep in mind that you are using Jython not Python, which I mention
> only to point out that it is *much* slower than the native Java
> processors such as UpdateAttribute, and slower than other scripting
> engines such as Groovy or Javascript/Nashorn.
>
> If performance/throughput is not a concern and you're more comfortable
> with Jython, then Jerry's suggestion of session.putAttribute(flowFile,
> attributeName, attributeValue) should do the trick. Note that if you
> are adding more than a couple attributes, it's probably better to
> create a dictionary (eventually/actually, a Java Map<String,String>)
> of attribute name/value pairs, and use putAllAttributes(flowFile,
> attributes) instead, as it is more performant.
>
> Regards,
> Matt
>
> On Tue, Jan 29, 2019 at 9:25 AM Tomislav Novosel <to.novo...@gmail.com>
> wrote:
> >
> > Thanks for the answer.
> >
> > Yes I know I can handle that with Expression language and
> > UpdateAttribute processor, but this is specific case on my work and I
> > think Python is better and more simple solution. I need to calc that
> > with python script.
> >
> > Tom
> >
> > On Tue, 29 Jan 2019 at 15:18, John McGinn <amruginn-n...@yahoo.com>
> > wrote:
> >>
> >> Since your script shows that "filename" is an attribute of your
> >> flowfile, you could use the UpdateAttribute processor.
> >>
> >> If you right click on UpdateAttribute and choose ShowUsage, then choose
> >> Expression Language Guide, it shows you the things you can handle.
> >>
> >> Something along the lines of ${filename:getDelimitedField(6,'_')}, if I
> >> understand the Groovy code correctly. I did a GenerateFlowFile to an
> >> UpdateAttribute processor setting filename to "1_2_3_4_5_6.2_abc", then
> >> sent that to another UpdateAttribute with the getDelimitedField() I
> >> listed and I received 6.2. Then another UpdateAttribute could parse the
> >> 6.2 for the second substring, or you might be able to chain them in the
> >> existing UpdateProcessor.
> >>
> >> --------------------------------------------
> >> On Tue, 1/29/19, Tomislav Novosel <to.novo...@gmail.com> wrote:
> >>
> >> Subject: Modify Flowfile attributes
> >> To: users@nifi.apache.org
> >> Date: Tuesday, January 29, 2019, 9:04 AM
> >>
> >> Hi all,
> >> I'm trying to calculate week number and date from filename using
> >> ExecuteScript processor and Jython. Here is the Python script.
> >> How can I add the calculated attributes week and year to the flowfile?
> >> Please help, thank you.
> >> Tom
> >> P.S. Maybe I completely missed with this script.
> >> Feel free to correct me.
> >>
> >> import json
> >> import java.io
> >> from org.apache.commons.io import IOUtils
> >> from java.nio.charset import StandardCharsets
> >> from org.apache.nifi.processor.io import StreamCallback
> >> from datetime import datetime, timedelta, date
> >>
> >> class PyStreamCallback(StreamCallback):
> >>     def __init__(self, flowfile):
> >>         self.ff = flowfile
> >>         pass
> >>     def process(self, inputStream, outputStream):
> >>         file_name = self.ff.getAttribute("filename")
> >>         date_file = file_name.split("_")[6]
> >>         date_final = date_file.split(".")[0]
> >>         date_obj = datetime.strptime(date_final, '%y%m%d')
> >>         date_year = date_obj.year
> >>         date_day = date_obj.day
> >>         date_month = date_obj.month
> >>         week = date(year=date_year, month=date_month, day=date_day).isocalendar()[1]
> >>         year = date(year=date_year, month=date_month, day=date_day).isocalendar()[0]
> >>
> >> flowFile = session.get()
> >> if (flowFile != None):
> >>     session.transfer(flowFile, REL_SUCCESS)
> >>     session.commit()
>
> --
> http://www.google.com/profiles/grapesmoker
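
For reference, the slicing workaround discussed in this thread could be wrapped in a small helper along the following lines. This is only a sketch: the function name iso_year_week and the CENTURY constant are made up for illustration and do not appear in the thread, and it assumes the date token is always six digits in YYMMDD form and that every date falls in the 2000s. It avoids strptime and strftime entirely and uses only date(...).isocalendar(), as the working line does.

```python
from datetime import date

# Assumption (not from the thread): every file date falls in the 2000s,
# so a two-digit year YY is interpreted as 20YY.
CENTURY = 2000


def iso_year_week(yymmdd):
    """Return (ISO year, ISO week) for a six-digit YYMMDD string.

    Hypothetical helper: it slices the string by hand instead of calling
    datetime.strptime, mirroring the workaround used in the thread.
    """
    if len(yymmdd) != 6 or not yymmdd.isdigit():
        raise ValueError("expected six digits in YYMMDD form, got %r" % (yymmdd,))
    year = CENTURY + int(yymmdd[0:2])
    month = int(yymmdd[2:4])
    day = int(yymmdd[4:6])
    iso = date(year, month, day).isocalendar()
    # iso[0] is the ISO year, iso[1] the ISO week number.
    return iso[0], iso[1]
```

In the ExecuteScript script, both values would still be converted with str() before being passed to session.putAttribute, as in the script that set year_extracted and week_extracted above.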