Hello,
I would like to formulate a query that returns the difference between items
stored in the same row.
An example for such a query in SQL is given here
https://stackoverflow.com/questions/9994862/date-difference-between-consecutive-rows
I modified it slightly to fit my example program below.
My program is the following:
from sqlalchemy import Column, Integer, String, DateTime, Float
from sqlalchemy import create_engine, text
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from datetime import datetime as dttm
import pandas as pd
import sys
Base = declarative_base()
class CTestTable(Base):
__tablename__ = 'testTable'
id = Column(Integer, primary_key=True, index=True)
datetime = Column(DateTime(timezone=False))
val = Column(Float)
info = Column(String(100))
def __repr__(self):
return "<User(name='%s', fullname='%s', password='%s')>" % (
self.name, self.fullname, self.password)
def main():
# create engine
engine = create_engine('sqlite:///test.sqlitedb', echo=False)
# create session
Session = sessionmaker()
Session.configure(bind=engine)
session = Session()
# create tables in database
Base.metadata.drop_all(engine)
Base.metadata.create_all(engine)
# add test data
testTables = []
testTables.append(CTestTable(datetime=dttm(2018, 1, 1), val=1.1,
info='t1'))
testTables.append(CTestTable(datetime=dttm(2018, 1, 3), val=1.4,
info='t1'))
testTables.append(CTestTable(datetime=dttm(2018, 1, 4), val=1.5,
info='t1'))
testTables.append(CTestTable(datetime=dttm(2018, 1, 1), val=11.1,
info='t2'))
testTables.append(CTestTable(datetime=dttm(2018, 1, 2), val=11.4,
info='t2'))
testTables.append(CTestTable(datetime=dttm(2018, 1, 4), val=11.5,
info='t2'))
session.add_all(testTables)
session.commit()
# print the table
qs = (session.query(CTestTable))
df = pd.read_sql(qs.statement, qs.session.bind)
print(df)
# applying SQL text:
ttext = (f"""select id,
datetime,
val,
info,
julianday("datetime") - julianday("PreviousDate")
as datediff
from ( select id,
datetime,
val,
info,
(select max(datetime)
from testTable T2
where T2.info=T1.info
and T2.datetime < T1.datetime
) as PreviousDate
from testTable T1
) as T""")
stmt = text(ttext)
df = pd.read_sql(stmt, session.bind)
print(df)
if __name__ == '__main__':
sys.exit(main())
The output is the following:
id datetime val info
0 1 2018-01-01 1.1 t1
1 2 2018-01-03 1.4 t1
2 3 2018-01-04 1.5 t1
3 4 2018-01-01 11.1 t2
4 5 2018-01-02 11.4 t2
5 6 2018-01-04 11.5 t2
id datetime val info datediff
0 1 2018-01-01 00:00:00.000000 1.1 t1 NaN
1 2 2018-01-03 00:00:00.000000 1.4 t1 2.0
2 3 2018-01-04 00:00:00.000000 1.5 t1 1.0
3 4 2018-01-01 00:00:00.000000 11.1 t2 NaN
4 5 2018-01-02 00:00:00.000000 11.4 t2 1.0
5 6 2018-01-04 00:00:00.000000 11.5 t2 2.0
The second table is the desired output.
As you can see, the SQL works perfectly ok (for sqlite) (aside from
performance issues on tables with >1. e7 rows)
But what is the way to perform this using sqlalchemy's ORM structures?
Somehow using alias(), select(), nested subqueries...
But how do I do that?
One expected benefit would be, of course, that the ORM way would be
database-backend independent (the backend could be changed to mysql or
other, the program would still run).
Thanks for any help!
--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
http://www.sqlalchemy.org/
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable
Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.