[
https://issues.apache.org/jira/browse/SPARK-12218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15048619#comment-15048619
]
Irakli Machabeli edited comment on SPARK-12218 at 12/9/15 2:40 PM:
-------------------------------------------------------------------
Below is the explain plan.
To make it clear, query that contains not (A and B) :
{code}
and not( PaymentsReceived=0 and ExplicitRoll in ('PreviouslyPaidOff',
'PreviouslyChargedOff'))
{code}
produces wrong results,
and query that is already expanded as (not A) or (not B) produces correct
output.
Physical plan look like this:
{code}
wrong results-- Filter ((LoanID#8629 = 62231) && NOT ((PaymentsReceived#8642 =
0.0) && ExplicitRoll#8652 IN (PreviouslyPaidOff,PreviouslyChargedOff)))
correct results -- Filter ((LoanID#8803 = 62231) && (NOT (PaymentsReceived#8816
= 0.0) || NOT ExplicitRoll#8826 IN (PreviouslyPaidOff,PreviouslyChargedOff)))
{code}
Explain plan results:
{code}
Wrong:
In [15]: sqlContext.read.parquet('prp_enh1').where(" LoanID=62231 and not(
PaymentsReceived=0.0 and ExplicitRoll in ('PreviouslyPaidOff',
'PreviouslyChargedOff'))").explain(True)
{code}
{noformat}
== Parsed Logical Plan ==
'Filter (('LoanID = 62231) && NOT (('PaymentsReceived = 0.0) && 'ExplicitRoll
IN (PreviouslyPaidOff,PreviouslyChargedOff)))
Relation[BorrowerRate#8625,MnthRate#8626,ObservationMonth#8627,CycleCounter#8628,LoanID#8629,Loankey#8630,OriginationDate#8631,OriginationQuarter#8632,LoanAmount#8633,Term#8634,LenderRate#8635,ProsperRating#8636,ScheduledMonthlyPaymentAmount#8637,ChargeoffMonth#8638,ChargeoffAmount#8639,CompletedMonth#8640,MonthOfLastPayment#8641,PaymentsReceived#8642,CollectionFees#8643,PrincipalPaid#8644,InterestPaid#8645,LateFees#8646,ServicingFees#8647,RecoveryPayments#8648,RecoveryPrin#8649,DaysPastDue#8650,PriorMonthDPD#8651,ExplicitRoll#8652,SummaryRoll#8653,CumulPrin#8654,EOMPrin#8655,ScheduledPrinRemaining#8656,ScheduledCumulPrin#8657,ScheduledPeriodicPrin#8658,BOMPrin#8659,ListingNumber#8660,DebtSaleMonth#8661,GrossCashFromDebtSale#8662,DebtSaleFee#8663,NetCashToInvestorsFromDebtSale#8664,OZVintage#8665]
ParquetRelation[file:/d:/MktLending/prp_enh1]
== Analyzed Logical Plan ==
BorrowerRate: double, MnthRate: double, ObservationMonth: date, CycleCounter:
int, LoanID: int, Loankey: string, OriginationDate: date, OriginationQuarter:
string, LoanAmount: double, Term: int, LenderRate: double, ProsperRating:
string, ScheduledMonthlyPaymentAmount: double, ChargeoffMonth: date,
ChargeoffAmount: double, CompletedMonth: date, MonthOfLastPayment: date,
PaymentsReceived: double, CollectionFees: double, PrincipalPaid: double,
InterestPaid: double, LateFees: double, ServicingFees: double,
RecoveryPayments: double, RecoveryPrin: double, DaysPastDue: int,
PriorMonthDPD: int, ExplicitRoll: string, SummaryRoll: string, CumulPrin:
double, EOMPrin: double, ScheduledPrinRemaining: double, ScheduledCumulPrin:
double, ScheduledPeriodicPrin: double, BOMPrin: double, ListingNumber: int,
DebtSaleMonth: int, GrossCashFromDebtSale: double, DebtSaleFee: double,
NetCashToInvestorsFromDebtSale: double, OZVintage: string
Filter ((LoanID#8629 = 62231) && NOT ((PaymentsReceived#8642 = cast(0.0 as
double)) && ExplicitRoll#8652 IN (PreviouslyPaidOff,PreviouslyChargedOff)))
Relation[BorrowerRate#8625,MnthRate#8626,ObservationMonth#8627,CycleCounter#8628,LoanID#8629,Loankey#8630,OriginationDate#8631,OriginationQuarter#8632,LoanAmount#8633,Term#8634,LenderRate#8635,ProsperRating#8636,ScheduledMonthlyPaymentAmount#8637,ChargeoffMonth#8638,ChargeoffAmount#8639,CompletedMonth#8640,MonthOfLastPayment#8641,PaymentsReceived#8642,CollectionFees#8643,PrincipalPaid#8644,InterestPaid#8645,LateFees#8646,ServicingFees#8647,RecoveryPayments#8648,RecoveryPrin#8649,DaysPastDue#8650,PriorMonthDPD#8651,ExplicitRoll#8652,SummaryRoll#8653,CumulPrin#8654,EOMPrin#8655,ScheduledPrinRemaining#8656,ScheduledCumulPrin#8657,ScheduledPeriodicPrin#8658,BOMPrin#8659,ListingNumber#8660,DebtSaleMonth#8661,GrossCashFromDebtSale#8662,DebtSaleFee#8663,NetCashToInvestorsFromDebtSale#8664,OZVintage#8665]
ParquetRelation[file:/d:/MktLending/prp_enh1]
== Optimized Logical Plan ==
Filter ((LoanID#8629 = 62231) && NOT ((PaymentsReceived#8642 = 0.0) &&
ExplicitRoll#8652 IN (PreviouslyPaidOff,PreviouslyChargedOff)))
Relation[BorrowerRate#8625,MnthRate#8626,ObservationMonth#8627,CycleCounter#8628,LoanID#8629,Loankey#8630,OriginationDate#8631,OriginationQuarter#8632,LoanAmount#8633,Term#8634,LenderRate#8635,ProsperRating#8636,ScheduledMonthlyPaymentAmount#8637,ChargeoffMonth#8638,ChargeoffAmount#8639,CompletedMonth#8640,MonthOfLastPayment#8641,PaymentsReceived#8642,CollectionFees#8643,PrincipalPaid#8644,InterestPaid#8645,LateFees#8646,ServicingFees#8647,RecoveryPayments#8648,RecoveryPrin#8649,DaysPastDue#8650,PriorMonthDPD#8651,ExplicitRoll#8652,SummaryRoll#8653,CumulPrin#8654,EOMPrin#8655,ScheduledPrinRemaining#8656,ScheduledCumulPrin#8657,ScheduledPeriodicPrin#8658,BOMPrin#8659,ListingNumber#8660,DebtSaleMonth#8661,GrossCashFromDebtSale#8662,DebtSaleFee#8663,NetCashToInvestorsFromDebtSale#8664,OZVintage#8665]
ParquetRelation[file:/d:/MktLending/prp_enh1]
== Physical Plan ==
Filter ((LoanID#8629 = 62231) && NOT ((PaymentsReceived#8642 = 0.0) &&
ExplicitRoll#8652 IN (PreviouslyPaidOff,PreviouslyChargedOff)))
Scan
ParquetRelation[file:/d:/MktLending/prp_enh1][BorrowerRate#8625,MnthRate#8626,ObservationMonth#8627,CycleCounter#8628,LoanID#8629,Loankey#8630,OriginationDate#8631,OriginationQuarter#8632,LoanAmount#8633,Term#8634,LenderRate#8635,ProsperRating#8636,ScheduledMonthlyPaymentAmount#8637,ChargeoffMonth#8638,ChargeoffAmount#8639,CompletedMonth#8640,MonthOfLastPayment#8641,PaymentsReceived#8642,CollectionFees#8643,PrincipalPaid#8644,InterestPaid#8645,LateFees#8646,ServicingFees#8647,RecoveryPayments#8648,RecoveryPrin#8649,DaysPastDue#8650,PriorMonthDPD#8651,ExplicitRoll#8652,SummaryRoll#8653,CumulPrin#8654,EOMPrin#8655,ScheduledPrinRemaining#8656,ScheduledCumulPrin#8657,ScheduledPeriodicPrin#8658,BOMPrin#8659,ListingNumber#8660,DebtSaleMonth#8661,GrossCashFromDebtSale#8662,DebtSaleFee#8663,NetCashToInvestorsFromDebtSale#8664,OZVintage#8665]
Code Generation: true
{noformat}
{code}
Correct:
In [19]: sqlContext.read.parquet('prp_enh1').where(" LoanID=62231 and (
not(PaymentsReceived=0) or not (ExplicitRoll in ('PreviouslyPaidOff',
'PreviouslyChargedOff')))").explain(True)
{code}
{noformat}
In [19]: sqlContext.read.parquet('prp_enh1').where(" LoanID=62231 and (
not(PaymentsReceived=0) or not (ExplicitRoll in ('PreviouslyPaidOff',
'PreviouslyChargedOff')))").explain(True)
== Parsed Logical Plan ==
'Filter (('LoanID = 62231) && (NOT ('PaymentsReceived = 0) || NOT 'ExplicitRoll
IN (PreviouslyPaidOff,PreviouslyChargedOff)))
Relation[BorrowerRate#8799,MnthRate#8800,ObservationMonth#8801,CycleCounter#8802,LoanID#8803,Loankey#8804,OriginationDate#8805,OriginationQuarter#8806,LoanAmount#8807,Term#8808,LenderRate#8809,ProsperRating#8810,ScheduledMonthlyPaymentAmount#8811,ChargeoffMonth#8812,ChargeoffAmount#8813,CompletedMonth#8814,MonthOfLastPayment#8815,PaymentsReceived#8816,CollectionFees#8817,PrincipalPaid#8818,InterestPaid#8819,LateFees#8820,ServicingFees#8821,RecoveryPayments#8822,RecoveryPrin#8823,DaysPastDue#8824,PriorMonthDPD#8825,ExplicitRoll#8826,SummaryRoll#8827,CumulPrin#8828,EOMPrin#8829,ScheduledPrinRemaining#8830,ScheduledCumulPrin#8831,ScheduledPeriodicPrin#8832,BOMPrin#8833,ListingNumber#8834,DebtSaleMonth#8835,GrossCashFromDebtSale#8836,DebtSaleFee#8837,NetCashToInvestorsFromDebtSale#8838,OZVintage#8839]
ParquetRelation[file:/d:/MktLending/prp_enh1]
== Analyzed Logical Plan ==
BorrowerRate: double, MnthRate: double, ObservationMonth: date, CycleCounter:
int, LoanID: int, Loankey: string, OriginationDate: date, OriginationQuarter:
string, LoanAmount: double, Term: int, LenderRate: double, ProsperRating:
string, ScheduledMonthlyPaymentAmount: double, ChargeoffMonth: date,
ChargeoffAmount: double, CompletedMonth: date, MonthOfLastPayment: date,
PaymentsReceived: double, CollectionFees: double, PrincipalPaid: double,
InterestPaid: double, LateFees: double, ServicingFees: double,
RecoveryPayments: double, RecoveryPrin: double, DaysPastDue: int,
PriorMonthDPD: int, ExplicitRoll: string, SummaryRoll: string, CumulPrin:
double, EOMPrin: double, ScheduledPrinRemaining: double, ScheduledCumulPrin:
double, ScheduledPeriodicPrin: double, BOMPrin: double, ListingNumber: int,
DebtSaleMonth: int, GrossCashFromDebtSale: double, DebtSaleFee: double,
NetCashToInvestorsFromDebtSale: double, OZVintage: string
Filter ((LoanID#8803 = 62231) && (NOT (PaymentsReceived#8816 = cast(0 as
double)) || NOT ExplicitRoll#8826 IN (PreviouslyPaidOff,PreviouslyChargedOff)))
Relation[BorrowerRate#8799,MnthRate#8800,ObservationMonth#8801,CycleCounter#8802,LoanID#8803,Loankey#8804,OriginationDate#8805,OriginationQuarter#8806,LoanAmount#8807,Term#8808,LenderRate#8809,ProsperRating#8810,ScheduledMonthlyPaymentAmount#8811,ChargeoffMonth#8812,ChargeoffAmount#8813,CompletedMonth#8814,MonthOfLastPayment#8815,PaymentsReceived#8816,CollectionFees#8817,PrincipalPaid#8818,InterestPaid#8819,LateFees#8820,ServicingFees#8821,RecoveryPayments#8822,RecoveryPrin#8823,DaysPastDue#8824,PriorMonthDPD#8825,ExplicitRoll#8826,SummaryRoll#8827,CumulPrin#8828,EOMPrin#8829,ScheduledPrinRemaining#8830,ScheduledCumulPrin#8831,ScheduledPeriodicPrin#8832,BOMPrin#8833,ListingNumber#8834,DebtSaleMonth#8835,GrossCashFromDebtSale#8836,DebtSaleFee#8837,NetCashToInvestorsFromDebtSale#8838,OZVintage#8839]
ParquetRelation[file:/d:/MktLending/prp_enh1]
== Optimized Logical Plan ==
Filter ((LoanID#8803 = 62231) && (NOT (PaymentsReceived#8816 = 0.0) || NOT
ExplicitRoll#8826 IN (PreviouslyPaidOff,PreviouslyChargedOff)))
Relation[BorrowerRate#8799,MnthRate#8800,ObservationMonth#8801,CycleCounter#8802,LoanID#8803,Loankey#8804,OriginationDate#8805,OriginationQuarter#8806,LoanAmount#8807,Term#8808,LenderRate#8809,ProsperRating#8810,ScheduledMonthlyPaymentAmount#8811,ChargeoffMonth#8812,ChargeoffAmount#8813,CompletedMonth#8814,MonthOfLastPayment#8815,PaymentsReceived#8816,CollectionFees#8817,PrincipalPaid#8818,InterestPaid#8819,LateFees#8820,ServicingFees#8821,RecoveryPayments#8822,RecoveryPrin#8823,DaysPastDue#8824,PriorMonthDPD#8825,ExplicitRoll#8826,SummaryRoll#8827,CumulPrin#8828,EOMPrin#8829,ScheduledPrinRemaining#8830,ScheduledCumulPrin#8831,ScheduledPeriodicPrin#8832,BOMPrin#8833,ListingNumber#8834,DebtSaleMonth#8835,GrossCashFromDebtSale#8836,DebtSaleFee#8837,NetCashToInvestorsFromDebtSale#8838,OZVintage#8839]
ParquetRelation[file:/d:/MktLending/prp_enh1]
== Physical Plan ==
Filter ((LoanID#8803 = 62231) && (NOT (PaymentsReceived#8816 = 0.0) || NOT
ExplicitRoll#8826 IN (PreviouslyPaidOff,PreviouslyChargedOff)))
Scan
ParquetRelation[file:/d:/MktLending/prp_enh1][BorrowerRate#8799,MnthRate#8800,ObservationMonth#8801,CycleCounter#8802,LoanID#8803,Loankey#8804,OriginationDate#8805,OriginationQuarter#8806,LoanAmount#8807,Term#8808,LenderRate#8809,ProsperRating#8810,ScheduledMonthlyPaymentAmount#8811,ChargeoffMonth#8812,ChargeoffAmount#8813,CompletedMonth#8814,MonthOfLastPayment#8815,PaymentsReceived#8816,CollectionFees#8817,PrincipalPaid#8818,InterestPaid#8819,LateFees#8820,ServicingFees#8821,RecoveryPayments#8822,RecoveryPrin#8823,DaysPastDue#8824,PriorMonthDPD#8825,ExplicitRoll#8826,SummaryRoll#8827,CumulPrin#8828,EOMPrin#8829,ScheduledPrinRemaining#8830,ScheduledCumulPrin#8831,ScheduledPeriodicPrin#8832,BOMPrin#8833,ListingNumber#8834,DebtSaleMonth#8835,GrossCashFromDebtSale#8836,DebtSaleFee#8837,NetCashToInvestorsFromDebtSale#8838,OZVintage#8839]
Code Generation: true
{noformat}
was (Author: imachabeli):
Below is the explain plan.
To make it clear, query that contains not (A and B) :
{code}
and not( PaymentsReceived=0 and ExplicitRoll in ('PreviouslyPaidOff',
'PreviouslyChargedOff'))
{code}
produces wrong results,
and query that is already expanded as (not A) or (not B) produces correct
output.
By the way I saw in explain plan cast(0 as double)) so I tried to change 0 =>
0.0 but no difference.
Physical plan looks similar:
{code}
'Filter (('LoanID = 62231) && (NOT ('PaymentsReceived = 0) || NOT 'ExplicitRoll
IN (PreviouslyPaidOff,PreviouslyChargedOff)))
Filter ((LoanID#8588 = 62231) && (NOT (PaymentsReceived#8601 = 0.0) || NOT
ExplicitRoll#8611 IN (PreviouslyPaidOff,PreviouslyChargedOff)))
{code}
Explain plan results:
{code}
In [13]: sqlContext.read.parquet('prp_enh1').where(" LoanID=62231 and not(
PaymentsReceived=0 and ExplicitRoll in ('PreviouslyPaidOff',
'PreviouslyChargedOff'))").explain(True)
{code}
{noformat}
== Parsed Logical Plan ==
'Filter (('LoanID = 62231) && NOT (('PaymentsReceived = 0) && 'ExplicitRoll IN
(PreviouslyPaidOff,PreviouslyChargedOff)))
Relation[BorrowerRate#8543,MnthRate#8544,ObservationMonth#8545,CycleCounter#8546,LoanID#8547,Loankey#8548,OriginationDate#8549,OriginationQuarter#8550,LoanAmount#8551,Term#8552,LenderRate#8553,ProsperRating#8554,ScheduledMonthlyPaymentAmount#8555,ChargeoffMonth#8556,ChargeoffAmount#8557,CompletedMonth#8558,MonthOfLastPayment#8559,PaymentsReceived#8560,CollectionFees#8561,PrincipalPaid#8562,InterestPaid#8563,LateFees#8564,ServicingFees#8565,RecoveryPayments#8566,RecoveryPrin#8567,DaysPastDue#8568,PriorMonthDPD#8569,ExplicitRoll#8570,SummaryRoll#8571,CumulPrin#8572,EOMPrin#8573,ScheduledPrinRemaining#8574,ScheduledCumulPrin#8575,ScheduledPeriodicPrin#8576,BOMPrin#8577,ListingNumber#8578,DebtSaleMonth#8579,GrossCashFromDebtSale#8580,DebtSaleFee#8581,NetCashToInvestorsFromDebtSale#8582,OZVintage#8583]
ParquetRelation[file:/d:/MktLending/prp_enh1]
== Analyzed Logical Plan ==
BorrowerRate: double, MnthRate: double, ObservationMonth: date, CycleCounter:
int, LoanID: int, Loankey: string, OriginationDate: date, OriginationQuarter:
string, LoanAmount: double, Term: int, LenderRate: double, ProsperRating:
string, ScheduledMonthlyPaymentAmount: double, ChargeoffMonth: date,
ChargeoffAmount: double, CompletedMonth: date, MonthOfLastPayment: date,
PaymentsReceived: double, CollectionFees: double, PrincipalPaid: double,
InterestPaid: double, LateFees: double, ServicingFees: double,
RecoveryPayments: double, RecoveryPrin: double, DaysPastDue: int,
PriorMonthDPD: int, ExplicitRoll: string, SummaryRoll: string, CumulPrin:
double, EOMPrin: double, ScheduledPrinRemaining: double, ScheduledCumulPrin:
double, ScheduledPeriodicPrin: double, BOMPrin: double, ListingNumber: int,
DebtSaleMonth: int, GrossCashFromDebtSale: double, DebtSaleFee: double,
NetCashToInvestorsFromDebtSale: double, OZVintage: string
Filter ((LoanID#8547 = 62231) && NOT ((PaymentsReceived#8560 = cast(0 as
double)) && ExplicitRoll#8570 IN (PreviouslyPaidOff,PreviouslyChargedOff)))
Relation[BorrowerRate#8543,MnthRate#8544,ObservationMonth#8545,CycleCounter#8546,LoanID#8547,Loankey#8548,OriginationDate#8549,OriginationQuarter#8550,LoanAmount#8551,Term#8552,LenderRate#8553,ProsperRating#8554,ScheduledMonthlyPaymentAmount#8555,ChargeoffMonth#8556,ChargeoffAmount#8557,CompletedMonth#8558,MonthOfLastPayment#8559,PaymentsReceived#8560,CollectionFees#8561,PrincipalPaid#8562,InterestPaid#8563,LateFees#8564,ServicingFees#8565,RecoveryPayments#8566,RecoveryPrin#8567,DaysPastDue#8568,PriorMonthDPD#8569,ExplicitRoll#8570,SummaryRoll#8571,CumulPrin#8572,EOMPrin#8573,ScheduledPrinRemaining#8574,ScheduledCumulPrin#8575,ScheduledPeriodicPrin#8576,BOMPrin#8577,ListingNumber#8578,DebtSaleMonth#8579,GrossCashFromDebtSale#8580,DebtSaleFee#8581,NetCashToInvestorsFromDebtSale#8582,OZVintage#8583]
ParquetRelation[file:/d:/MktLending/prp_enh1]
== Optimized Logical Plan ==
Filter ((LoanID#8547 = 62231) && NOT ((PaymentsReceived#8560 = 0.0) &&
ExplicitRoll#8570 IN (PreviouslyPaidOff,PreviouslyChargedOff)))
Relation[BorrowerRate#8543,MnthRate#8544,ObservationMonth#8545,CycleCounter#8546,LoanID#8547,Loankey#8548,OriginationDate#8549,OriginationQuarter#8550,LoanAmount#8551,Term#8552,LenderRate#8553,ProsperRating#8554,ScheduledMonthlyPaymentAmount#8555,ChargeoffMonth#8556,ChargeoffAmount#8557,CompletedMonth#8558,MonthOfLastPayment#8559,PaymentsReceived#8560,CollectionFees#8561,PrincipalPaid#8562,InterestPaid#8563,LateFees#8564,ServicingFees#8565,RecoveryPayments#8566,RecoveryPrin#8567,DaysPastDue#8568,PriorMonthDPD#8569,ExplicitRoll#8570,SummaryRoll#8571,CumulPrin#8572,EOMPrin#8573,ScheduledPrinRemaining#8574,ScheduledCumulPrin#8575,ScheduledPeriodicPrin#8576,BOMPrin#8577,ListingNumber#8578,DebtSaleMonth#8579,GrossCashFromDebtSale#8580,DebtSaleFee#8581,NetCashToInvestorsFromDebtSale#8582,OZVintage#8583]
ParquetRelation[file:/d:/MktLending/prp_enh1]
== Physical Plan ==
Filter ((LoanID#8547 = 62231) && NOT ((PaymentsReceived#8560 = 0.0) &&
ExplicitRoll#8570 IN (PreviouslyPaidOff,PreviouslyChargedOff)))
Scan
ParquetRelation[file:/d:/MktLending/prp_enh1][BorrowerRate#8543,MnthRate#8544,ObservationMonth#8545,CycleCounter#8546,LoanID#8547,Loankey#8548,OriginationDate#8549,OriginationQuarter#8550,LoanAmount#8551,Term#8552,LenderRate#8553,ProsperRating#8554,ScheduledMonthlyPaymentAmount#8555,ChargeoffMonth#8556,ChargeoffAmount#8557,CompletedMonth#8558,MonthOfLastPayment#8559,PaymentsReceived#8560,CollectionFees#8561,PrincipalPaid#8562,InterestPaid#8563,LateFees#8564,ServicingFees#8565,RecoveryPayments#8566,RecoveryPrin#8567,DaysPastDue#8568,PriorMonthDPD#8569,ExplicitRoll#8570,SummaryRoll#8571,CumulPrin#8572,EOMPrin#8573,ScheduledPrinRemaining#8574,ScheduledCumulPrin#8575,ScheduledPeriodicPrin#8576,BOMPrin#8577,ListingNumber#8578,DebtSaleMonth#8579,GrossCashFromDebtSale#8580,DebtSaleFee#8581,NetCashToInvestorsFromDebtSale#8582,OZVintage#8583]
Code Generation: true
{noformat}
{code}
In [14]: sqlContext.read.parquet('prp_enh1').where(" LoanID=62231 and (
not(PaymentsReceived=0) or not (ExplicitRoll in ('PreviouslyPaidOff',
'PreviouslyChargedOff')))").explain(True)
{code}
{noformat}
== Parsed Logical Plan ==
'Filter (('LoanID = 62231) && (NOT ('PaymentsReceived = 0) || NOT 'ExplicitRoll
IN (PreviouslyPaidOff,PreviouslyChargedOff)))
Relation[BorrowerRate#8584,MnthRate#8585,ObservationMonth#8586,CycleCounter#8587,LoanID#8588,Loankey#8589,OriginationDate#8590,OriginationQuarter#8591,LoanAmount#8592,Term#8593,LenderRate#8594,ProsperRating#8595,ScheduledMonthlyPaymentAmount#8596,ChargeoffMonth#8597,ChargeoffAmount#8598,CompletedMonth#8599,MonthOfLastPayment#8600,PaymentsReceived#8601,CollectionFees#8602,PrincipalPaid#8603,InterestPaid#8604,LateFees#8605,ServicingFees#8606,RecoveryPayments#8607,RecoveryPrin#8608,DaysPastDue#8609,PriorMonthDPD#8610,ExplicitRoll#8611,SummaryRoll#8612,CumulPrin#8613,EOMPrin#8614,ScheduledPrinRemaining#8615,ScheduledCumulPrin#8616,ScheduledPeriodicPrin#8617,BOMPrin#8618,ListingNumber#8619,DebtSaleMonth#8620,GrossCashFromDebtSale#8621,DebtSaleFee#8622,NetCashToInvestorsFromDebtSale#8623,OZVintage#8624]
ParquetRelation[file:/d:/MktLending/prp_enh1]
== Analyzed Logical Plan ==
BorrowerRate: double, MnthRate: double, ObservationMonth: date, CycleCounter:
int, LoanID: int, Loankey: string, OriginationDate: date, OriginationQuarter:
string, LoanAmount: double, Term: int, LenderRate: double, ProsperRating:
string, ScheduledMonthlyPaymentAmount: double, ChargeoffMonth: date,
ChargeoffAmount: double, CompletedMonth: date, MonthOfLastPayment: date,
PaymentsReceived: double, CollectionFees: double, PrincipalPaid: double,
InterestPaid: double, LateFees: double, ServicingFees: double,
RecoveryPayments: double, RecoveryPrin: double, DaysPastDue: int,
PriorMonthDPD: int, ExplicitRoll: string, SummaryRoll: string, CumulPrin:
double, EOMPrin: double, ScheduledPrinRemaining: double, ScheduledCumulPrin:
double, ScheduledPeriodicPrin: double, BOMPrin: double, ListingNumber: int,
DebtSaleMonth: int, GrossCashFromDebtSale: double, DebtSaleFee: double,
NetCashToInvestorsFromDebtSale: double, OZVintage: string
Filter ((LoanID#8588 = 62231) && (NOT (PaymentsReceived#8601 = cast(0 as
double)) || NOT ExplicitRoll#8611 IN (PreviouslyPaidOff,PreviouslyChargedOff)))
Relation[BorrowerRate#8584,MnthRate#8585,ObservationMonth#8586,CycleCounter#8587,LoanID#8588,Loankey#8589,OriginationDate#8590,OriginationQuarter#8591,LoanAmount#8592,Term#8593,LenderRate#8594,ProsperRating#8595,ScheduledMonthlyPaymentAmount#8596,ChargeoffMonth#8597,ChargeoffAmount#8598,CompletedMonth#8599,MonthOfLastPayment#8600,PaymentsReceived#8601,CollectionFees#8602,PrincipalPaid#8603,InterestPaid#8604,LateFees#8605,ServicingFees#8606,RecoveryPayments#8607,RecoveryPrin#8608,DaysPastDue#8609,PriorMonthDPD#8610,ExplicitRoll#8611,SummaryRoll#8612,CumulPrin#8613,EOMPrin#8614,ScheduledPrinRemaining#8615,ScheduledCumulPrin#8616,ScheduledPeriodicPrin#8617,BOMPrin#8618,ListingNumber#8619,DebtSaleMonth#8620,GrossCashFromDebtSale#8621,DebtSaleFee#8622,NetCashToInvestorsFromDebtSale#8623,OZVintage#8624]
ParquetRelation[file:/d:/MktLending/prp_enh1]
== Optimized Logical Plan ==
Filter ((LoanID#8588 = 62231) && (NOT (PaymentsReceived#8601 = 0.0) || NOT
ExplicitRoll#8611 IN (PreviouslyPaidOff,PreviouslyChargedOff)))
Relation[BorrowerRate#8584,MnthRate#8585,ObservationMonth#8586,CycleCounter#8587,LoanID#8588,Loankey#8589,OriginationDate#8590,OriginationQuarter#8591,LoanAmount#8592,Term#8593,LenderRate#8594,ProsperRating#8595,ScheduledMonthlyPaymentAmount#8596,ChargeoffMonth#8597,ChargeoffAmount#8598,CompletedMonth#8599,MonthOfLastPayment#8600,PaymentsReceived#8601,CollectionFees#8602,PrincipalPaid#8603,InterestPaid#8604,LateFees#8605,ServicingFees#8606,RecoveryPayments#8607,RecoveryPrin#8608,DaysPastDue#8609,PriorMonthDPD#8610,ExplicitRoll#8611,SummaryRoll#8612,CumulPrin#8613,EOMPrin#8614,ScheduledPrinRemaining#8615,ScheduledCumulPrin#8616,ScheduledPeriodicPrin#8617,BOMPrin#8618,ListingNumber#8619,DebtSaleMonth#8620,GrossCashFromDebtSale#8621,DebtSaleFee#8622,NetCashToInvestorsFromDebtSale#8623,OZVintage#8624]
ParquetRelation[file:/d:/MktLending/prp_enh1]
== Physical Plan ==
Filter ((LoanID#8588 = 62231) && (NOT (PaymentsReceived#8601 = 0.0) || NOT
ExplicitRoll#8611 IN (PreviouslyPaidOff,PreviouslyChargedOff)))
Scan
ParquetRelation[file:/d:/MktLending/prp_enh1][BorrowerRate#8584,MnthRate#8585,ObservationMonth#8586,CycleCounter#8587,LoanID#8588,Loankey#8589,OriginationDate#8590,OriginationQuarter#8591,LoanAmount#8592,Term#8593,LenderRate#8594,ProsperRating#8595,ScheduledMonthlyPaymentAmount#8596,ChargeoffMonth#8597,ChargeoffAmount#8598,CompletedMonth#8599,MonthOfLastPayment#8600,PaymentsReceived#8601,CollectionFees#8602,PrincipalPaid#8603,InterestPaid#8604,LateFees#8605,ServicingFees#8606,RecoveryPayments#8607,RecoveryPrin#8608,DaysPastDue#8609,PriorMonthDPD#8610,ExplicitRoll#8611,SummaryRoll#8612,CumulPrin#8613,EOMPrin#8614,ScheduledPrinRemaining#8615,ScheduledCumulPrin#8616,ScheduledPeriodicPrin#8617,BOMPrin#8618,ListingNumber#8619,DebtSaleMonth#8620,GrossCashFromDebtSale#8621,DebtSaleFee#8622,NetCashToInvestorsFromDebtSale#8623,OZVintage#8624]
Code Generation: true
{noformat}
> Boolean logic in sql does not work "not (A and B)" is not the same as "(not
> A) or (not B)"
> --------------------------------------------------------------------------------------------
>
> Key: SPARK-12218
> URL: https://issues.apache.org/jira/browse/SPARK-12218
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.5.2
> Reporter: Irakli Machabeli
> Priority: Blocker
>
> Two identical queries produce different results
> In [2]: sqlContext.read.parquet('prp_enh1').where(" LoanID=62231 and not(
> PaymentsReceived=0 and ExplicitRoll in ('PreviouslyPaidOff',
> 'PreviouslyChargedOff'))").count()
> Out[2]: 18
> In [3]: sqlContext.read.parquet('prp_enh1').where(" LoanID=62231 and (
> not(PaymentsReceived=0) or not (ExplicitRoll in ('PreviouslyPaidOff',
> 'PreviouslyChargedOff')))").count()
> Out[3]: 28
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]