Re: Assertion of return value of dataframe in pytest

Mich Talebzadeh Wed, 03 Feb 2021 08:51:12 -0800

It appears that the following assertion works, assuming the result set can
be empty (count = 0) or contain data (count > 0):


assert df2.count() >= 0
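
Note that count() never returns a negative number, so the assertion above
can only fail if building or evaluating df2 raises. A minimal sketch of a
tighter check follows, assuming the column aliases from the quoted select
below are the expected schema. The helper name check_yearly_averages, the
min_rows default and the MagicMock stand-in (used so the sketch runs
without a Spark session) are illustrative assumptions, not part of the
original program.

```python
from unittest.mock import MagicMock

# Column aliases taken from the select() quoted later in this thread.
EXPECTED_COLUMNS = [
    "YEAR",
    "REGIONNAME",
    "AVGPRICEPERYEAR",
    "AVGFLATPRICEPERYEAR",
    "AVGTERRACEDPRICEPERYEAR",
    "AVGSDPRICEPRICEPERYEAR",
    "AVGDETACHEDPRICEPERYEAR",
]


def check_yearly_averages(df, expected_columns, min_rows=1):
    """Hypothetical helper: check the shape of the result, not just that it exists.

    Works with anything exposing .columns and .count(), as a PySpark
    DataFrame does. min_rows=1 is an assumption; pass 0 if an empty
    result is legitimate for the test data.
    """
    actual = list(df.columns)
    assert actual == expected_columns, f"unexpected columns: {actual}"
    n = df.count()
    assert n >= min_rows, f"expected at least {min_rows} row(s), got {n}"
    return n


def test_check_passes_for_well_formed_result():
    # Stand-in for df2; in a real test this would be built from a small
    # fixture DataFrame on a local SparkSession.
    fake_df = MagicMock()
    fake_df.columns = list(EXPECTED_COLUMNS)
    fake_df.count.return_value = 3
    assert check_yearly_averages(fake_df, EXPECTED_COLUMNS) == 3
```

In a real test the same helper would be called on df2 itself, for example
check_yearly_averages(df2, EXPECTED_COLUMNS).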

However, if I wanted to write to a JDBC database from PySpark through a
function (already defined in another module) as below


def writeTableToOracle(dataFrame, mode, dataset, tableName):
    try:
        dataFrame. \
            write. \
            format("jdbc"). \
            option("url", oracle_url). \
            option("dbtable", tableName). \
            option("user", config['OracleVariables']['oracle_user']). \
            option("password", config['OracleVariables']['oracle_password']). \
            option("driver", config['OracleVariables']['oracle_driver']). \
            mode(mode). \
            save()
    except Exception as e:
        print(f"""{e}, quitting""")
        sys.exit(1)


and call it in the program


from sparkutils import sparkstuff as s

s.writeTableToOracle(df2, "overwrite",
                     config['OracleVariables']['dbschema'],
                     config['OracleVariables']['yearlyAveragePricesAllTable'])


How can one assert its validity in PyTest?
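
One possible approach, sketched here under stated assumptions, is to test
the function without a database by passing in a mock DataFrame and
asserting on the writer calls it receives. The function body is copied
from above so the sketch is self-contained; the values of oracle_url and
config, the table and schema names, and the helper make_fake_dataframe are
hypothetical stand-ins. This checks only that the JDBC write is requested
with the right options and that a failure ends in sys.exit(1); it does not
prove that rows reached Oracle, which would need an integration test that
reads the table back.

```python
import sys
from unittest.mock import MagicMock

# Hypothetical stand-ins for the module-level settings used by the function.
oracle_url = "jdbc:oracle:thin:@localhost:1521:orcl"
config = {
    "OracleVariables": {
        "oracle_user": "test_user",
        "oracle_password": "test_password",
        "oracle_driver": "oracle.jdbc.OracleDriver",
    }
}


def writeTableToOracle(dataFrame, mode, dataset, tableName):
    # Same logic as the function shown above.
    try:
        (dataFrame.write
            .format("jdbc")
            .option("url", oracle_url)
            .option("dbtable", tableName)
            .option("user", config["OracleVariables"]["oracle_user"])
            .option("password", config["OracleVariables"]["oracle_password"])
            .option("driver", config["OracleVariables"]["oracle_driver"])
            .mode(mode)
            .save())
    except Exception as e:
        print(f"{e}, quitting")
        sys.exit(1)


def make_fake_dataframe():
    """Build a mock whose writer methods return the writer itself,
    so the fluent chain can be inspected afterwards."""
    writer = MagicMock()
    writer.format.return_value = writer
    writer.option.return_value = writer
    writer.mode.return_value = writer
    df = MagicMock()
    df.write = writer
    return df, writer


def test_write_requests_expected_jdbc_save():
    df, writer = make_fake_dataframe()
    writeTableToOracle(df, "overwrite", "some_schema", "some_table")
    writer.format.assert_called_once_with("jdbc")
    writer.option.assert_any_call("url", oracle_url)
    writer.option.assert_any_call("dbtable", "some_table")
    writer.mode.assert_called_once_with("overwrite")
    writer.save.assert_called_once_with()


def test_write_exits_with_code_1_on_failure():
    df, writer = make_fake_dataframe()
    # Simulated failure; a real run would surface a JDBC or Spark error here.
    writer.save.side_effect = RuntimeError("simulated JDBC failure")
    try:
        writeTableToOracle(df, "overwrite", "some_schema", "some_table")
    except SystemExit as exc:
        assert exc.code == 1
    else:
        raise AssertionError("expected sys.exit(1)")
```

With pytest available, the try/except in the second test could be replaced
by "with pytest.raises(SystemExit):". Because the function calls
sys.exit(1) rather than re-raising, SystemExit is the only signal a test
can observe on failure.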


Thanks again

On Wed, 3 Feb 2021 at 15:12, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Hi,
>
> In Pytest you want to ensure that the composed DF returns the correct result.
>
> Example
>
>     df2 = house_df. \
>         select( \
>         F.date_format('datetaken', 'yyyy').cast("Integer").alias('YEAR') \
>         , 'REGIONNAME' \
>         , round(F.avg('averageprice').over(wSpecY)).alias('AVGPRICEPERYEAR') \
>         , round(F.avg('flatprice').over(wSpecY)).alias('AVGFLATPRICEPERYEAR') \
>         , round(F.avg('TerracedPrice').over(wSpecY)).alias('AVGTERRACEDPRICEPERYEAR') \
>         , round(F.avg('SemiDetachedPrice').over(wSpecY)).alias('AVGSDPRICEPRICEPERYEAR') \
>         , round(F.avg('DetachedPrice').over(wSpecY)).alias('AVGDETACHEDPRICEPERYEAR')). \
>         distinct().orderBy('datetaken', ascending=True)
>
> Will it be enough to run just this command?
>
>   assert not []
>
> I believe that may be flawed, because the assertion never looks at df2, so
> any error will still be assumed to be NOT NULL?
>
> Thanks
