Re: HashingTFModel/IDFModel in Structured Streaming

2017-11-14 Thread Bago Amirbekian
There is a known issue with VectorAssembler that causes it to fail in streaming if any of the input columns are of VectorType and don't have size information: https://issues.apache.org/jira/browse/SPARK-22346. This can be fixed by adding size information to the vector columns. I've made a PR to add

Re: HashingTFModel/IDFModel in Structured Streaming

2017-11-09 Thread Bago Amirbekian
Davis, were you able to find an example? Anything you have could help.

On Wed, Nov 1, 2017 at 8:53 PM Davis Varghese wrote:
> Sure. I will get one over the weekend
>
> --
> Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/

Re: HashingTFModel/IDFModel in Structured Streaming

2017-11-01 Thread Bago Amirbekian
Davis, I'm looking into this. If you could include some code that I can use to reproduce the error, along with the stack trace, it would be really helpful.

On Fri, Oct 20, 2017 at 11:01 AM Joseph Bradley wrote:
> Hi Davis,
> We've started tracking these issues under this umbrella:
> https://issues.apache.or