Hi,
I'm using sqlContext.jdbc(uri, table, where).map(_ => 1).aggregate(0)(_+_,_+_)
on an interactive shell (where "where" is an Array[String] of 32 to 48
elements). (The code is tailored to our db, specifically through the where
conditions; I'd otherwise have posted it.)
That should be the DataFrame
Never mind my comment about 3. You were talking about the read side, while
I was thinking about the write side. Your workaround actually is a pretty
good idea. Can you create a JIRA for that as well?
On Monday, June 1, 2015, Reynold Xin r...@databricks.com wrote:
René,
Thanks for sharing your
Hi *,
I used to run into a few problems with the jdbc/mysql integration and
thought it would be nice to load our whole db, doing nothing but
.map(_ => 1).aggregate(0)(_+_,_+_) on the DataFrames.
SparkSQL has to load all columns and process them, so this should reveal
type errors like SPARK-7897.
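A minimal sketch of that smoke test, assuming a hypothetical uri, table and
set of where predicates (sqlContext is the one provided by the shell):

  // Each where clause becomes one partition; mapping every row to 1 and
  // summing forces SparkSQL to materialize and type-check every column.
  val uri = "jdbc:mysql://dbhost:3306/mydb?user=me&password=secret" // placeholder
  val table = "some_table"                                          // placeholder
  val where = (0 until 32).map(i => s"id % 32 = $i").toArray        // placeholder predicates
  val rowCount = sqlContext.jdbc(uri, table, where)
    .map(_ => 1)
    .aggregate(0)(_ + _, _ + _)
  println(s"$table: $rowCount rows")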
Is this backported to branch 1.3?
On 31 May 2015, at 00:44, Reynold Xin r...@databricks.com wrote:
FYI we merged a patch that improves unit test log debugging. In order for that
to work, all test suites have been changed to extend SparkFunSuite instead of
ScalaTest's FunSuite.
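A hypothetical suite under the new convention (SparkFunSuite is
private[spark], so this assumes the suite lives under an org.apache.spark
subpackage):

  package org.apache.spark.example  // assumption: needed for package-private access

  import org.apache.spark.SparkFunSuite

  // Extending SparkFunSuite instead of ScalaTest's FunSuite picks up the
  // improved per-test log markers automatically.
  class ExampleSuite extends SparkFunSuite {
    test("addition") {
      assert(1 + 1 === 2)
    }
  }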
René,
Thanks for sharing your experience. Are you using the DataFrame API or SQL?
(1) Any recommendations on what we do w.r.t. out of range values? Should we
silently turn them into a null? Maybe based on an option?
(2) Looks like a good idea to always quote column names. The small tricky
thing
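A minimal sketch of the quoting idea in (2), assuming MySQL-style backtick
quoting (the helper name is hypothetical):

  // Quoting every identifier keeps reserved words like `order` valid in
  // the generated SQL, at the cost of always emitting quote characters.
  def quoteIdentifier(col: String): String = "`" + col + "`"
  val columns = Seq("key", "value", "order")
  val projection = columns.map(quoteIdentifier).mkString(", ")
  // projection == "`key`, `value`, `order`"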
Hello,
Someone proposed in a JIRA issue to implement new graph operations. Sean
Owen recommended checking first with the mailing list whether this is
interesting or not.
So I would like to know if it would be interesting for GraphX to implement
operators like:
I don't think so.
On Monday, June 1, 2015, Steve Loughran ste...@hortonworks.com wrote:
Is this backported to branch 1.3?
On 31 May 2015, at 00:44, Reynold Xin r...@databricks.com wrote:
FYI we merged a patch that improves unit test
Still having problems using HiveContext from sbt. Here's an example of the
dependencies:
val sparkVersion = "1.4.0-rc3"
lazy val root = Project(id = "spark-hive", base = file("."),
  settings = Project.defaultSettings ++ Seq(
    name := "spark-1.4-hive",
    scalaVersion := "2.10.5",
    scalaBinaryVersion := "2.10",
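The excerpt cuts off there; a hypothetical continuation of such a build
(the dependency list is an assumption for illustration, not from the
original message):

    // Typical modules for exercising HiveContext against the RC;
    // the exact list in the original message is truncated.
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % sparkVersion,
      "org.apache.spark" %% "spark-sql"  % sparkVersion,
      "org.apache.spark" %% "spark-hive" % sparkVersion
    )
  ))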
Hi Peter,
Based on your error message, it seems you were not using RC3. For the
error thrown at HiveContext's line 206, we have changed the message to this
one
https://github.com/apache/spark/blob/v1.4.0-rc3/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala#L205-207
just
before
It will be within the next few days
2015-06-01 9:17 GMT-07:00 Reynold Xin r...@databricks.com:
I don't think so.
On Monday, June 1, 2015, Steve Loughran ste...@hortonworks.com wrote:
Is this backported to branch 1.3?
On 31 May 2015, at 00:44, Reynold Xin r...@databricks.com wrote:
+1 (binding)
Tested the standalone cluster mode REST submission gateway - submit /
status / kill
Tested simple applications on YARN client / cluster modes with and without
--jars
Tested python applications on YARN client / cluster modes with and without
--py-files*
Tested dynamic allocation on
Thanks, René. I actually added a warning to the new JDBC reader/writer
interface for 1.4.0.
Even with that, I think we should support throttling JDBC; otherwise it's
too easy for our users to DoS their production database servers!
/**
* Construct a [[DataFrame]] representing the
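A sketch of the throttling idea using the 1.4 reader's partitioned-read
overload, which caps concurrent connections at numPartitions (url, table
and column names are placeholders):

  import java.util.Properties

  // Rows are split into 8 ranges over the numeric id column, so at most
  // 8 connections hit the database at a time.
  val df = sqlContext.read.jdbc(
    url = "jdbc:mysql://dbhost:3306/mydb",  // placeholder
    table = "events",                       // placeholder
    columnName = "id",                      // numeric split column
    lowerBound = 0L,
    upperBound = 1000000L,
    numPartitions = 8,
    connectionProperties = new Properties())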
Thanks Yin, tried on a clean VM - works now. But tests in my app still
fail:
[info] Cause: javax.jdo.JDOFatalDataStoreException: Unable to open a
test connection to the given database. JDBC url =
jdbc:derby:;databaseName=metastore_db;create=true, username = APP.
Terminating connection pool
It's no longer valid to start more than one instance of HiveContext in a
single JVM, as one of the goals of this refactoring was to allow connection
to more than one metastore from a single context.
For tests I suggest you use TestHive as we do in our unit tests. It has a
reset() method you can
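A sketch of that approach, assuming the spark-hive test artifacts are on
the test classpath (the table name is hypothetical):

  import org.apache.spark.sql.hive.test.TestHive

  // Reuse the singleton test context across suites instead of creating
  // fresh HiveContexts, and reset state between them.
  TestHive.sql("CREATE TABLE IF NOT EXISTS t (key INT, value STRING)")
  // ... assertions against TestHive ...
  TestHive.reset()  // drops tables and clears cached state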
I get a bunch of failures in VersionSuite with build/test params
-Pyarn -Phive -Phadoop-2.6:
- success sanity check *** FAILED ***
java.lang.RuntimeException: [download failed:
org.jboss.netty#netty;3.2.2.Final!netty.jar(bundle), download failed:
commons-net#commons-net;3.1!commons-net.jar]
HiveContext works on RC3 for MapR after adding
spark.sql.hive.metastore.sharedPrefixes as suggested in SPARK-7819
https://issues.apache.org/jira/browse/SPARK-7819. However, there still
seem to be some other issues with native libraries; I get the below warning:
WARN NativeCodeLoader: Unable to load
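A sketch of how that workaround might be set programmatically; the MapR
class prefix below is an assumption based on the ticket, not copied from
this thread:

  import org.apache.spark.{SparkConf, SparkContext}

  // Classes matching these prefixes are loaded by the classloader shared
  // between Spark and the Hive metastore client.
  val conf = new SparkConf()
    .set("spark.sql.hive.metastore.sharedPrefixes",
      "com.mysql.jdbc,org.postgresql,com.mapr.fs.shim")
  val sc = new SparkContext(conf)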
Hi there,
I noticed in the latest Spark SQL programming guide
https://spark.apache.org/docs/latest/sql-programming-guide.html, there is
support for optimized reading of partitioned Parquet files that have a
particular directory structure (year=1/month=10/day=3, for example).
However, I see no
Hey Bobby,
Those are generic warnings that the hadoop libraries throw. If you are
using MapRFS they shouldn't matter since you are using the MapR client
and not the default hadoop client.
Do you have any issues with functionality... or was it just seeing the
warnings that was the concern?
There will be in 1.4.
df.write.partitionBy("year", "month", "day").parquet("/path/to/output")
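Assuming that layout, reading the root path back should discover the
partition columns automatically (a sketch, not from the original message):

  val df = sqlContext.read.parquet("/path/to/output")
  df.printSchema()  // year, month, day appear alongside the data columns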
On Mon, Jun 1, 2015 at 10:21 PM, Matt Cheah mch...@palantir.com wrote:
Hi there,
I noticed in the latest Spark SQL programming guide
https://spark.apache.org/docs/latest/sql-programming-guide.html, there
Hi Tim,
(added dev, removed user)
I've created https://issues.apache.org/jira/browse/SPARK-8009 to track this.
-kr, Gerard.
On Sat, May 30, 2015 at 7:10 PM, Tim Chen t...@mesosphere.io wrote:
So it sounds like some generic downloadable URIs support can solve this
problem, that Mesos
Hi Patrick,
Thanks for clarifying. No issues with functionality.
+1 (non-binding)
Thanks
Bobby
On Mon, Jun 1, 2015 at 9:41 PM, Patrick Wendell pwend...@gmail.com wrote:
Hey Bobby,
Those are generic warnings that the hadoop libraries throw. If you are
using MapRFS they