Re: Spark SQL Transaction

2016-04-23 Thread Andrés Ivaldi
Thanks, I'll take a look at JdbcUtils.

Regards.



-- 
Ing. Ivaldi Andres


Re: Spark SQL Transaction

2016-04-23 Thread Todd Nist
I believe the class you are looking for is
org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala.

By default in savePartition(...), it will do the following:

    if (supportsTransactions) {
      conn.setAutoCommit(false) // Everything in the same db transaction.
    }

Then, at line 224, it will issue the commit:

    if (supportsTransactions) {
      conn.commit()
    }

HTH

-Todd
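For context, a minimal sketch of the commit/rollback pattern that savePartition implements (paraphrased and simplified, not the exact Spark source):

    import java.sql.{Connection, SQLException}

    // Simplified sketch of JdbcUtils.savePartition's transaction handling:
    // one transaction per partition when the dialect supports it.
    def savePartitionSketch(
        getConnection: () => Connection,
        rows: Iterator[Seq[Any]],
        insertSql: String,
        supportsTransactions: Boolean): Unit = {
      val conn = getConnection()
      try {
        if (supportsTransactions) {
          conn.setAutoCommit(false) // everything in one db transaction
        }
        val stmt = conn.prepareStatement(insertSql)
        try {
          rows.foreach { row =>
            row.zipWithIndex.foreach { case (v, i) =>
              stmt.setObject(i + 1, v.asInstanceOf[AnyRef])
            }
            stmt.executeUpdate()
          }
        } finally {
          stmt.close()
        }
        if (supportsTransactions) {
          conn.commit() // the commit issued at the end of the partition
        }
      } catch {
        case e: SQLException =>
          if (supportsTransactions) conn.rollback() // the whole partition is undone
          throw e
      } finally {
        conn.close()
      }
    }

So with a driver that reports transaction support, each partition is written as one transaction, which is why killing the job mid-write rolls everything back.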



Re: Spark SQL Transaction

2016-04-23 Thread Mich Talebzadeh
In your JDBC connection you can do

    conn.commit();

or

    conn.rollback();

Why don't you insert your data into a #temp table in MSSQL and from there do one INSERT ... SELECT into the main table? That is standard ETL practice. In that case your main table will be protected: either it will have the full data or no data.
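A minimal sketch of that staging approach over plain JDBC (the connection URL and the staging/main table and column names are illustrative, not from this thread):

    import java.sql.DriverManager

    // Hedged sketch: stage rows into a session-scoped #temp table, then
    // publish them with one INSERT ... SELECT so the main table sees
    // either all of the rows or none of them.
    val conn = DriverManager.getConnection(
      "jdbc:sqlserver://host:1433;databaseName=mydb", "user", "password")
    val stmt = conn.createStatement()
    try {
      stmt.execute("CREATE TABLE #staging (col1 INT)")
      // ... bulk-insert the rows into #staging here ...
      stmt.execute("INSERT INTO main_table (col1) SELECT col1 FROM #staging")
    } finally {
      stmt.close()
      conn.close() // closing the connection also drops #staging
    }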

Also, have you specified the max packet size in JDBC for the load into the MSSQL table? That will improve the speed.

Try experimenting by creating a CSV-type file and using bulk load with autocommit, say every 10,000 rows, into the MSSQL table. That will tell you if there is any issue. Ask the DBA to provide you with the max packet size etc. Another limitation would be the transaction log in the MSSQL database getting full.

HTH





Re: Spark SQL Transaction

2016-04-23 Thread Andrés Ivaldi
Hello. I ran Profiler and found that implicit isolation was turned on by the JDBC driver; this is the default behavior of the MSSQL JDBC driver, but it is possible to change it with the setAutoCommit method. There is no connection property for that, so I have to do it in code. Do you know where I can access the instance of the JDBC connection that Spark uses for DataFrames?
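If the internal connection can't be reached, one workaround is to bypass the DataFrame JDBC writer and write each partition through a connection you own, where autocommit is under your control. A minimal sketch, assuming a DataFrame df with a single integer column and an illustrative target table myTable:

    import java.sql.DriverManager

    // Hedged sketch: manage the JDBC connection yourself so that
    // setAutoCommit is in your hands rather than Spark's.
    df.foreachPartition { rows =>
      val conn = DriverManager.getConnection(
        "jdbc:sqlserver://host:1433;databaseName=mydb", "user", "password")
      conn.setAutoCommit(true) // each INSERT commits on its own
      val stmt = conn.prepareStatement("INSERT INTO myTable (col1) VALUES (?)")
      try {
        rows.foreach { row =>
          stmt.setInt(1, row.getInt(0))
          stmt.executeUpdate()
        }
      } finally {
        stmt.close()
        conn.close()
      }
    }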

Regards.



-- 
Ing. Ivaldi Andres


Re: Spark SQL Transaction

2016-04-21 Thread Mich Talebzadeh
This statement:

"... each database statement is atomic and is itself a transaction ... your statements should be atomic and there will be no 'redo' or 'commit' or 'rollback'."

MSSQL complies with ACID, which requires that each transaction be "all or nothing": if one part of the transaction fails, then the entire transaction fails, and the database state is left unchanged.

Assuming that it is one transaction (though I much doubt JDBC does that, as it would take forever), then either that transaction commits (in MSSQL, redo and undo are combined in the syslogs table of the database), meaning undo and redo log records will be generated for those rows in syslogs, or it rolls back. So under normal operation every RDBMS, including MSSQL, Oracle, Sybase and others, will generate redo and undo, and one cannot avoid it. If there is a batch transaction, as I suspect in this case, it is either all or nothing. The thread owner indicated that a rollback is happening, so that is consistent with all rows being rolled back.

I don't think Spark, Sqoop, or Hive can influence the transaction behaviour of an RDBMS for DML. DQ (data queries) does not generate transactions.

HTH





Re: Spark SQL Transaction

2016-04-21 Thread Michael Segel
Hi,

Sometimes terms get muddled over time.

If you're not using transactions, then each database statement is atomic and is itself a transaction. So unless you have some explicit 'Begin Work' at the start, your statements should be atomic and there will be no 'redo' or 'commit' or 'rollback'.

I don't see anything in Spark's documentation about transactions, so the statements should be atomic. (I'm not a guru here, so I could be missing something in Spark.)

If you're seeing the connection drop unexpectedly and then a rollback, could this be a setting or configuration of the database?
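For what it's worth, that is exactly JDBC's autocommit model. A minimal sketch (the connection details and the testme table are illustrative):

    import java.sql.DriverManager

    // Hedged sketch: with autocommit (the JDBC default) each statement is
    // its own transaction; rollback only has scope once autocommit is off.
    val conn = DriverManager.getConnection(
      "jdbc:sqlserver://host:1433;databaseName=mydb", "user", "password")
    println(conn.getAutoCommit) // true by default: no explicit BEGIN/COMMIT
    val stmt = conn.createStatement()
    stmt.executeUpdate("INSERT INTO testme VALUES (1)") // commits immediately
    conn.setAutoCommit(false) // only now does a multi-statement transaction exist
    stmt.executeUpdate("INSERT INTO testme VALUES (2)")
    conn.rollback() // row 2 is undone; row 1 persists
    stmt.close()
    conn.close()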





Re: Spark SQL Transaction

2016-04-20 Thread Mich Talebzadeh
Actually, you are correct. It will be considered a non-logged operation, which the DBAs probably won't allow in production. The only option for the thread owner is to perform smaller batches with frequent commits in MSSQL.
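A minimal sketch of that batching pattern over plain JDBC (the batch size, connection URL, table and column names are illustrative):

    import java.sql.DriverManager

    // Hedged sketch: commit every batchSize rows so a failure only rolls
    // back the current batch instead of the whole load.
    val conn = DriverManager.getConnection(
      "jdbc:sqlserver://host:1433;databaseName=mydb", "user", "password")
    conn.setAutoCommit(false)
    val stmt = conn.prepareStatement("INSERT INTO myTable (col1) VALUES (?)")
    val batchSize = 10000 // illustrative; tune with the DBA
    var count = 0
    try {
      for (v <- 1 to 100000) { // stand-in for the real row source
        stmt.setInt(1, v)
        stmt.addBatch()
        count += 1
        if (count % batchSize == 0) {
          stmt.executeBatch()
          conn.commit() // everything up to here is now durable
        }
      }
      stmt.executeBatch() // flush the tail
      conn.commit()
    } finally {
      stmt.close()
      conn.close()
    }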



RE: Spark SQL Transaction

2016-04-20 Thread Strange, Nick
NOLOGGING means no redo log is generated (or minimal redo). However, undo is still generated, and the transaction will still be rolled back in the event of an issue.

Nick


Re: Spark SQL Transaction

2016-04-20 Thread Mich Talebzadeh
Well, Oracle will allow that if the underlying table is in NOLOGGING mode :)

mtale...@mydb12.mich.LOCAL> create table testme(col1 int);

Table created.

mtale...@mydb12.mich.LOCAL> alter table testme NOLOGGING;

Table altered.

mtale...@mydb12.mich.LOCAL> insert into testme values(1);

1 row created.


Re: Spark SQL Transaction

2016-04-20 Thread Andrés Ivaldi
I think the same, and I don't think reducing the batch size improves speed, but it will avoid losing all the data on a rollback.

Thanks for the help.


Re: Spark SQL Transaction

2016-04-20 Thread Mich Talebzadeh
Yep. I think it is not possible to make SQL Server do a non-logged transaction. The other alternative is doing inserts in small batches if possible, or writing to a CSV-type file and using bulk copy to load the file into MSSQL with frequent commits, say every 50K rows?




Re: Spark SQL Transaction

2016-04-20 Thread Andrés Ivaldi
Yes, I know that behavior, but there is no explicit BEGIN TRANSACTION in my code, so maybe Spark or the driver itself is adding the begin transaction, or an implicit transaction is configured. If Spark isn't adding a BEGIN TRANSACTION on each insertion, then it is probably database or driver configuration...
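One way to check that from the client side; a minimal sketch (SET IMPLICIT_TRANSACTIONS is standard T-SQL; the connection details are illustrative):

    import java.sql.DriverManager

    // Hedged sketch: inspect the driver's autocommit state and force
    // implicit transactions off for the session.
    val conn = DriverManager.getConnection(
      "jdbc:sqlserver://host:1433;databaseName=mydb", "user", "password")
    println(s"autoCommit = ${conn.getAutoCommit}") // true => no driver-side BEGIN TRAN
    val stmt = conn.createStatement()
    stmt.execute("SET IMPLICIT_TRANSACTIONS OFF") // statements then commit individually
    stmt.close()
    conn.close()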



-- 
Ing. Ivaldi Andres


Fwd: Spark SQL Transaction

2016-04-20 Thread Mich Talebzadeh
You will see what is happening in SQL Server. First create a test table called testme:

1> use tempdb
2> go
1> create table testme(col1 int)
2> go

-- Now explicitly begin a transaction, insert 1 row and select from the table
1> begin tran
2> insert into testme values(1)
3> select * from testme
4> go
(1 row affected)
 col1
 ---
    1

(1 row affected)

-- That value col1=1 is there.
-- Now rollback that transaction, meaning in your case by killing your Spark process!
1> rollback tran
2> select * from testme
3> go
 col1
 ---

(0 rows affected)

-- You can see that the record has gone as it rolled back!




Re: Spark SQL Transaction

2016-04-20 Thread Andrés Ivaldi
Sorry I couldn't answer before. I want to know if Spark is responsible for adding the BEGIN TRAN. The point is to prioritize insertion speed over the risk of losing data: disabling transactions will speed up the insertion, and we don't care about consistency... I'll disable the implicit transaction and see what happens.

Thanks



-- 
Ing. Ivaldi Andres


Re: Spark SQL Transaction

2016-04-20 Thread Mich Talebzadeh
Assuming that you are using JDBC for putting data into any ACID-compliant database (MSSQL, Sybase, Oracle, etc.), you are implicitly or explicitly adding BEGIN TRAN to the INSERT statement in a distributed transaction. MSSQL does not know or care where the data is coming from. If your connection completes OK, a COMMIT TRAN will be sent, and that will tell MSSQL to commit the transaction. If you kill the Spark job before MSSQL receives the COMMIT TRAN, the transaction will be rolled back.

The only option, if you don't care about the full data set getting to MSSQL, is to break your insert into chunks at source and send the data to MSSQL in small batches. In that way you will not lose all the data in MSSQL because of a rollback.
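A minimal sketch of chunking at source in Spark, assuming a DataFrame df and an append-mode JDBC write (the split count, URL, table name and credentials are illustrative):

    import java.util.Properties

    // Hedged sketch: split the DataFrame and write each chunk as its own
    // JDBC job, so a failure only loses the in-flight chunk.
    val props = new Properties()
    props.setProperty("user", "user")         // illustrative credentials
    props.setProperty("password", "password")
    val url = "jdbc:sqlserver://host:1433;databaseName=mydb"

    val chunks = df.randomSplit(Array.fill(10)(1.0)) // ten roughly equal chunks
    chunks.foreach { chunk =>
      chunk.write.mode("append").jdbc(url, "myTable", props) // commits per chunk
    }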

HTH



Re: Spark SQL Transaction

2016-04-19 Thread Mich Talebzadeh
Are you using JDBC to push data to MSSQL?



Re: Spark SQL Transaction

2016-04-19 Thread Andrés Ivaldi
I mean a local transaction. We ran a job that writes into SQL Server, then we killed the Spark JVM just for testing purposes, and we realized that SQL Server did a rollback.

Regards



-- 
Ing. Ivaldi Andres


Re: Spark SQL Transaction

2016-04-19 Thread Mich Talebzadeh
Hi,

What do you mean by "without transaction"? Do you mean forcing SQL Server to accept a non-logged operation?



Spark SQL Transaction

2016-04-19 Thread Andrés Ivaldi
Hello, is it possible to execute a SQL write without a transaction? We don't need transactions to save our data, and they add overhead to the SQL Server.

Regards.

-- 
Ing. Ivaldi Andres