Re: [PySpark] Tagging descriptions

2020-05-14 Thread Amol Umbarkar
appreciate all the help!
>
> Thanks,
> -Rishi
>
>
> On Thu, May 14, 2020 at 6:11 AM Amol Umbarkar wrote:
>
>> Rishi,
>> Just adding to zhang's questions.
>>
>> Are you expecting multiple tags per row?
>> Do you check multiple regex f

Re: [PySpark] Tagging descriptions

2020-05-14 Thread Amol Umbarkar
Rishi,
Just adding to zhang's questions.

Are you expecting multiple tags per row?
Do you check multiple regexes for a single tag?

Let's say you had only one tag; then theoretically you should be able to do this -

1 Remove stop words or any irrelevant stuff
2 Split text into equal sized chunk column (eg -
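
A rough sketch of steps 1 and 2 as far as they are stated above, in plain Python so it can be tried without a cluster. Everything named here is an assumption for illustration: the stop-word set, the helper names remove_stop_words and split_into_chunks, the sample description, and in particular the chunk size and the choice to chunk by token count, since the message is cut off before the example. In a PySpark job the same functions could presumably be applied per row, for instance through a UDF.

```python
import re

# Hypothetical stop-word list; a real job would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "and", "for", "with", "in", "on", "to"}


def remove_stop_words(text):
    """Lowercase the text, keep alphanumeric tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def split_into_chunks(tokens, chunk_size):
    """Split tokens into consecutive chunks of chunk_size tokens.

    The last chunk may be shorter when the token count is not a multiple
    of chunk_size.
    """
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


# Made-up description, only to exercise the two helpers.
description = "Payment for the repair of a laptop screen and keyboard"
tokens = remove_stop_words(description)
chunks = split_into_chunks(tokens, chunk_size=3)  # chunk size is an assumption
```

Here tokens holds the cleaned words in their original order and chunks groups them three at a time; whether chunks should be measured in tokens or characters is not stated in the thread.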